
Author Topic: FPGA odd behavior  (Read 3720 times)


kokowang5699

    Topic Starter


    Rookie

    • Experience: Experienced
    • OS: Other
    FPGA odd behavior
    « on: November 13, 2016, 06:27:48 AM »
    I am programming a filter for an FPGA.
    The input is 14 bits, with registers arranged as [13:0], at a sample rate of 125 million samples per second.

    I have coded the input/output relationship as follows, where x[n] is the input and y[n] is the output:
    a_{K,L}[n] = x[n] − x[n − K] − x[n − L] + x[n − K − L]
    b[n] = b[n − 1] + a_{K,L}[n],  n ≥ 0
    c[n] = b[n] + M·a_{K,L}[n]
    y[n] = y[n − 1] + c[n],  n ≥ 0
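
For anyone following along, the four equations above can be sketched as a software reference model to compare against simulator or hardware output. This is only an illustration, not the poster's code: the function name `trapezoid_filter` is hypothetical, samples before n = 0 are assumed to be zero, and Python integers never overflow, so it shows what an overflow-free implementation should produce.

```python
def trapezoid_filter(x, K, L, M):
    """Integer reference model of the a, b, c, y equations in the post."""
    def sample(i):
        # Assumption: the input is zero before n = 0.
        return x[i] if i >= 0 else 0

    b = 0  # first accumulator, holds b[n-1] between iterations
    y = 0  # second accumulator, holds y[n-1] between iterations
    out = []
    for n in range(len(x)):
        a = sample(n) - sample(n - K) - sample(n - L) + sample(n - K - L)
        b += a           # b[n] = b[n-1] + a[n]
        c = b + M * a    # c[n] = b[n] + M * a[n]
        y += c           # y[n] = y[n-1] + c[n]
        out.append(y)
    return out


# A unit impulse with K=2, L=3, M=0 gives a small trapezoid:
# trapezoid_filter([1, 0, 0, 0, 0, 0, 0, 0], K=2, L=3, M=0)
# -> [1, 2, 2, 1, 0, 0, 0, 0]
```

Feeding the same input vector to the hardware and to a model like this makes it easier to see at which sample the second accumulator first departs from the expected value.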

    Two recursions are needed: one at b[n] = b[n-1] + ... and one at y[n] = y[n-1] + ...
    The first recursion, at b[n], works properly and gives the expected outputs. The second recursion, however, exhibits some odd behavior: random spikes appear in the output for about 20 nanoseconds, after which it returns to the expected values. The input is 14 bits wide, all operations are done on 64-bit registers, and the oscillations are at a frequency of 100,000 Hz, so no overflow should be happening. Even if you're not sure how to fix this, any suggestions on what's causing it or on possible solutions would be greatly appreciated (I honestly have no idea what could be happening). Thanks in advance!
    (Note: I should add that dividing c[n] by 1024 solves the issue, although I have no idea why. It does create a new problem, though: because of rounding errors, y[n] drifts toward infinity if I divide c[n] by 1024. I am fine with dividing c[n] by 1024 if I can stop y[n] from drifting, as y[n] will simply be a trigger for the computer to capture x[n].)
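
The drift described in the note can be reproduced in a few lines. The sketch below is an assumption-laden illustration, not the poster's design: it uses Python's `>>`, which floors toward negative infinity, whereas the actual divider module may truncate or round differently, and both function names are hypothetical. It contrasts dividing each c[n] before accumulating with keeping the accumulator at full precision and scaling only the output.

```python
def scale_then_accumulate(c, shift=10):
    """Divide each c[n] by 2**shift first, then accumulate (as in the note)."""
    y, out = 0, []
    for value in c:
        y += value >> shift  # the discarded remainder is lost on every sample
        out.append(y)
    return out


def accumulate_then_scale(c, shift=10):
    """Accumulate at full precision; divide only the value that is output."""
    y, out = 0, []
    for value in c:
        y += value
        out.append(y >> shift)  # rounding error stays bounded, never fed back
    return out


# c[n] alternates +1500, -1500, so the true running sum keeps returning to 0.
c = [1500, -1500] * 100
# scale_then_accumulate(c)[-1]  -> -100  (one count of drift per pair)
# accumulate_then_scale(c)[-1]  -> 0
```

Because the rounding error in the second form is not fed back into the recursion, it cannot grow over time. Whether moving the division after the accumulator brings back the 20 ns spikes is a separate question that this sketch does not address.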

    - The code for this will be released as open source once a working version is developed. For now, I'd prefer not to release any major section of the code. If you need a specific module to work out what's causing this problem, let me know and I'll post the code for it if reasonable. c[n] is correct; y[n] is incorrect after the recursion. Every operation in the equations (coefficients, +, -, *, /, etc.) represents a module.

    Additional information:
    Compiler - Vivado 2016.02 on MBP r13' 2015
    OS - computer: Ubuntu 16.04; FPGA system: Debian (custom OS)
    Input - Advanced Digital Cable.
    « Last Edit: November 13, 2016, 06:57:35 AM by kokowang5699 »

    DaveLembke



      Sage
    • Thanked: 662
    • Experience: Expert
    • OS: Windows 10
    Re: FPGA odd behavior
    « Reply #1 on: November 13, 2016, 10:31:48 AM »
    Quote
    The second recursion, however, exhibits some odd behavior. There are random spikes in the outputs for about 20 nanoseconds then return to expected values

    The question I have is: do these 20 ns spikes affect the end result, or is it just odd behavior while it runs, with the end result otherwise fine?

    The reason I ask is that I once ran into some oddities in how a multi-core CPU handled some code: I expected all active threads to finish at about the same time, but they didn't. I had a multithreaded execution, and when that part of the process was done, another process grabbed the data and refined it further. As a quick fix, I ended up putting in a delay so that all threads had finished processing before the next step ran.

    Digging deeper into why some threads finished sooner than others, it came down to the OS itself using time slices on the multi-core system, which slowed those threads. A thread wasn't given a constant green light to keep crunching numbers; it hit red lights while the OS ran other duties, and waiting for the green light added lag to that thread, so the threads ended in unpredictable order. I played with core affinity to keep the OS on a single core of the quad-core system, and this helped some. In the end, though, rather than isolating the OS to one core, it was simpler to add a time delay after whichever thread ends soonest: with a delay window that is large enough but not too long, all threads are at the finish line for the first part of the crunching before the result is used for the next step.

    I'm thinking you're running into this delay (the 20 ns spike), which may even act harmonically at times as well as randomly, depending on how the OS and your program are using the CPU. One thing to try would be to set core affinity so that the OS runs on a single core and the other available cores are used just for your processing, and see if this changes the 20 ns spike issue.

    This is a situation where the code is likely sound and solid, but the way the CPU is juggling the OS and your program is giving you this oddity. Here is more info on core affinity for Ubuntu: http://www.hecticgeek.com/2012/03/assign-process-cpu-ubuntu-linux/

    Setting dedicated cores with core affinity for your program might clean this up, in my opinion.

    Also, through this process, does the system have plenty of RAM, or is RAM use at 100%, so that data is paged to the HDD and then has to play catch-up to get back into execution?

    kokowang5699

      Re: DaveLembke
      « Reply #2 on: November 14, 2016, 04:47:33 AM »
      Sorry, I think you misunderstand. The computer creating the FPGA design is running Ubuntu. There are error checks while synthesizing the code, so it's highly unlikely that the compiled code is wrong repeatedly. (I compiled 7 slightly different versions attempting to fix the problem without doing c[n]/1024.)
      The computer actually running the FPGA is running Debian (Red Pitaya OS). The CPU is set to directly upload input and output to Ethernet; it does nothing else. Calculations are done by hardware logic. The number of cores it's using shouldn't affect its ability to display numbers correctly.
      As for the FPGA, it's purely sequential with an alternating clock, so no race conditions exist. If a logic series happened to take more than 0.5 clock cycles, I'd be extremely surprised. I can test this by adding a delay between every module, but I doubt it will change things. The problem occurs at the second accumulator module, which is identical to the first accumulator module. The reason I say this is odd is that the first one works and the second one doesn't, despite them being literally the exact same.

      I am trying something right now that will hopefully fix it. Currently the circuit is in one line that goes like this:
      [13:0]
      [63:0] x 7
      [13:0]

      I will try rewiring in this fashion:
      [13:0]           [63:0] x 2
      [63:0] x 6     [13:0]

      No idea if it will work at the moment.
      The RAM usage, as you suggested, may also be a cause of this. I will disable some unnecessary modules and see if it fixes things. The RAM is partitioned, and I don't have the equipment to monitor the section dedicated to the FPGA.

      kokowang5699

        Update
        « Reply #3 on: November 14, 2016, 07:28:45 PM »
        Okay, I just tested the rewiring and it takes care of the spikes. But, of course, a new problem has appeared (why am I not surprised...).
        Let's cover the easy problem first:
        The output is off by about 2048, which is an easy fix with offset settings.

        Now, 2 more problems that I have no idea how to fix:
        1. Oscillations of ±80 on top of the original waveform, as if a sinusoidal wave were added to the output waveform. This problem also occurred in the simulator, so it may be an actual logic problem; let's not worry about that for now.
        2. Incorrect shaping of the output waveform (which suggests the synthesized logic isn't an exact match of the equations in the first post).

        The simulated logic from the code matches the equations, so I'll assume synthesis errors or hardware delays are causing the incorrect shaping. I am now rewiring the circuit again, separating all modules this time. Hopefully this will be a usable version after synthesis. If not, I will free up room (possibly by getting rid of the digital input filtering and decimation modules) for registers between each module. If anyone knows a better way to do this, please let me know. Thanks.
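
One way to narrow down where the synthesized logic departs from the equations is to capture the output of each module and compare it sample by sample against the simulated values. The helper below is a minimal sketch of that comparison; the name `first_mismatch` and the `tolerance` parameter are hypothetical, and it assumes both sequences are already aligned in time.

```python
def first_mismatch(expected, actual, tolerance=0):
    """Return the index of the first sample where the two sequences differ
    by more than `tolerance`, or None if they agree over the common length."""
    for n, (e, a) in enumerate(zip(expected, actual)):
        if abs(e - a) > tolerance:
            return n
    return None


# Example with made-up numbers: the capture departs from the model at n = 3.
# first_mismatch([0, 1, 2, 3, 4], [0, 1, 2, 9, 4])  -> 3
```

Running this on the outputs of a[n], b[n], c[n] and y[n] in turn would show which module is the first to disagree with the simulation, and a nonzero `tolerance` can absorb a known constant offset while looking for shape errors.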

        DaveLembke



        Re: FPGA odd behavior
        « Reply #4 on: November 14, 2016, 08:36:34 PM »
        Quote
        1. Oscillations of ±80 to the original waveform, as if a sinusoidal wave is added on to the output waveform. This problem also occurred in simulator, so it may be an actual logic problem, let's not worry about that for now.
        Quote
        2. Incorrect shaping of output waveform (suggests synthesized logic isn't an exact match of equation mentioned in first post).

        The help you're looking for is slightly outside the scope of what most people here are able to assist with. You're dabbling in an area, FPGAs and your electronics, that I am not a specialist in. I have worked with DACs and DSPs, but the waveforms were used for linear positioning with transducers, etc. Automation and microcomputer electronics is where I dabbled in a past career, which I left in 2001 after 6 years with Rockwell Automation. Electronics has come quite a way since 2001. Sorry to say that I'm probably not going to be of much more help on this.

        Regarding #1, if it's a fixed sinusoidal frequency that you want to eliminate, I would think cancellation would just be a matter of a synchronous phase lock onto it, removing it as synchronous noise. However, finding the cause would be better than adding band-aids like a phase-lock cleanup. The problem with a synchronous phase-lock cleanup is that, while it would remove the sine wave, there is a chance that for very small time intervals you would be cutting out the signal you want where the sine wave intersects your signal, and this could make for more problems.
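
As a rough offline illustration of this kind of cancellation (not a phase-locked loop, and not something taken from the poster's design), the sketch below fits a single sinusoid at a known frequency to a captured record and subtracts it. The function name `remove_sinusoid` is hypothetical, the interfering frequency is assumed to be known exactly, and the fit is only exact when the record spans a whole number of periods.

```python
import math


def remove_sinusoid(signal, freq_hz, sample_rate_hz):
    """Subtract the component of `signal` at `freq_hz` (assumed known)."""
    n = len(signal)
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    cos_ref = [math.cos(w * i) for i in range(n)]
    sin_ref = [math.sin(w * i) for i in range(n)]
    # Project the record onto the two reference waves. This equals a
    # least-squares fit when the record covers a whole number of periods.
    a = 2.0 / n * sum(s * c for s, c in zip(signal, cos_ref))
    b = 2.0 / n * sum(s * r for s, r in zip(signal, sin_ref))
    return [s - a * c - b * r for s, c, r in zip(signal, cos_ref, sin_ref)]


# Hypothetical test record: a constant level of 5 plus a +/-80 sine wave at
# 100 kHz, sampled at 125 MS/s for exactly one period (1250 samples).
fs, f = 125e6, 100e3
w = 2.0 * math.pi * f / fs
record = [5.0 + 80.0 * math.sin(w * i + 0.3) for i in range(1250)]
cleaned = remove_sinusoid(record, f, fs)  # every sample is close to 5.0
```

The caveat in the paragraph above applies here too: any genuine signal content at that same frequency is removed along with the interference.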

        Regarding #2, I'm thinking #1's noise is causing the shaping issues.  :-\

        Do you have a schematic available of the electronics you're using that we could take a look at?