
Author Topic: how to get the Count of string in file  (Read 35524 times)


ghostdog74



    Specialist

    Thanked: 27
    Re: how to get the Count of string in file
    « Reply #105 on: August 15, 2010, 08:58:04 PM »
    Quote
    Ghost, your grep works
    of course. It's better than using sed, which you proclaim is the "best".

    Quote
    . I had an old 2005 version.
    time to change. We are not living in olden times anymore

    Quote
    Your skill level has improved. Who is your Tutor?
    I have been playing with *nix since ancient times. My tutors were greg and bill rich; now it's you and vishu...

    BC_Programmer


      Mastermind
    • Typing is no substitute for thinking.
    • Thanked: 1140
      • BC-Programming.com
    • Certifications: List
    • Computer: Specs
    • Experience: Beginner
    • OS: Windows 11
    Re: how to get the Count of string in file
    « Reply #106 on: August 15, 2010, 09:10:35 PM »
    Quote
    I reiterate my point. Size of a file does not matter if what you are comparing is the result of the output between 2 pieces of code.
    That's what I said.

    Quote
    you want to make sure the output produced by the 2 pieces of code are the same.
    Agree.

    Quote
    Size of file does matter in a benchmark, when you are concerned about the way the program is written and the algorithm used. That is, whether you have used the most optimized method when dealing with big files.
     

    Yes, it does, but only if you perform a <single> benchmark. ST did two. Therefore there are two points of reference and, as I noted, a linear formula can be derived from those two data points that roughly approximates a short range of the values of whatever the actual relationship between them is. A third data point will be enough to create a parabola, but that doesn't mean that the performance relationship is a parabola; it's just all you can do with 3 points. It could very well be a cubic function of the size of the input.
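To make the two-point versus three-point argument concrete, here is a minimal Python sketch. The function names and the sample timings are made up for illustration and are not figures from this thread. It fits the one line that passes through two (size, seconds) measurements and the one quadratic that passes through three, which is all those points can determine.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two (size, seconds) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def parabola_through(p1, p2, p3):
    """Return a function evaluating the unique quadratic through three points.

    Uses the Lagrange interpolation form, so no linear solver is needed.
    """
    pts = [p1, p2, p3]

    def evaluate(x):
        total = 0.0
        for i, (xi, yi) in enumerate(pts):
            term = yi
            for j, (xj, _) in enumerate(pts):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return evaluate


if __name__ == "__main__":
    # Hypothetical measurements: (file size in MB, seconds). Not real data.
    a, b, c = (1, 0.5), (10, 4.0), (100, 90.0)
    slope, intercept = line_through(a, b)
    quad = parabola_through(a, b, c)
    # Both models agree on the measured points and can diverge everywhere else.
    print("line at 50 MB:    ", slope * 50 + intercept)
    print("parabola at 50 MB:", quad(50))
```

Either curve reproduces the points it was built from; neither says what the real relationship is outside that short range.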

    The thing is, here, we can <SEE> the code. We can see why, right off, a larger file would make a difference. It doesn't matter if that larger file is larger by a million or a billion bytes; it's still larger, that difference is reflected in the timings, and the reason is rather obvious, as partially evidenced by your quick mention of it.


    Quote
    Because of the size of the file, you have chosen to read the files in chunks.
    That's a direct consequence of taking size into consideration when designing your program. That's why size does matter in a benchmark. 1 million is way different from 1 billion!

    Oh, yes, of course, because everybody knows that you can't read in chunks for both 1 million and 1 billion. I obviously specifically designed it for the exact size that ST gave, I was in no way trying to make it more generic and efficient for smaller files (which it is, even a 128K file will benefit from chunk reading because it causes less stress on the task allocator and also causes less process memory fragmentation).
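For readers who have not seen the chunked approach, here is a minimal Python sketch of counting a string in a file without loading the whole file. It is an illustration only, not the code posted earlier in this thread; the function name, the 1 MiB default chunk size, and the non-overlapping counting rule are assumptions. The one subtle part is carrying the last few bytes of each buffer forward so that a match straddling two chunks is not missed.

```python
def count_in_file(path, needle, chunk_size=1 << 20):
    """Count non-overlapping occurrences of `needle` (bytes) in a file.

    The file is read in fixed-size chunks, so memory use stays roughly
    constant whether the file holds a million bytes or a billion.
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    keep = len(needle) - 1  # bytes that could begin a match spanning two chunks
    count = 0
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            pos = 0
            while True:
                hit = buf.find(needle, pos)
                if hit == -1:
                    break
                count += 1
                pos = hit + len(needle)
            # Carry over only unmatched bytes that might start a straddling match.
            start = max(pos, len(buf) - keep)
            tail = buf[start:]
    return count
```

A call might look like `count_in_file("pi.txt", b"14")`, where `pi.txt` is a placeholder file name. The same function serves small and large files alike, which is the sense in which chunking is a generic change rather than one tuned to a particular size.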

    If you want to get right down to it, all benchmarks are flawed because of the timing code: it changes the results by being there, but you can't get results without it. The difference is that the same benchmark code surrounds all the different timed blocks, and therefore that fact can be ignored in the results.
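A small Python sketch of that idea follows, with hypothetical names throughout. Every candidate is wrapped in the identical timing code, so whatever overhead the harness adds is added to all of them equally and drops out when the results are compared.

```python
import time


def time_each(candidates, *args, repeats=3):
    """Time every callable with the same harness.

    `candidates` maps a label to a callable. Each is called with `*args`
    `repeats` times, and the best (smallest) elapsed time in seconds is kept.
    """
    results = {}
    for name, func in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(*args)
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)
        results[name] = best
    return results


if __name__ == "__main__":
    # Two illustrative ways of counting a character in a string.
    text = "3.14159265358979" * 10000
    timings = time_each(
        {
            "str.count": lambda s: s.count("1"),
            "manual loop": lambda s: sum(1 for ch in s if ch == "1"),
        },
        text,
    )
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.6f} s")
```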


    I will agree that there are certainly instances where a million and a billion are a significant difference algorithm-wise, but at the same time, is not even the slightest floating point error a huge difference when it comes to the algorithms for trigonometric functions? It's a matter of the goal of the code in question as to exactly what constitutes a significant difference. In this case, ST was essentially testing a large file. That was all I considered; I wasn't making sweeping design changes based on the fact that it was in millions as opposed to billions, but rather generic changes where it won't matter whether it was a million or a billion. Will the timing be different for a billion and a million? Of course it will. And I will agree that in that sense, the results are flawed. But you assume that my changes are based on his results, when in fact they are merely based on the simple premise that it doesn't work properly for large files. I didn't pay very close attention to the specific timings, because all I needed to know was that it was slower with larger files. I didn't need to know how many milliseconds it took to process X characters.
    I was trying to dereference Null Pointers before it was cool.

    Salmon Trout

    • Guest
    Re: how to get the Count of string in file
    « Reply #107 on: August 16, 2010, 12:14:19 AM »
    Quote
    You showed a benchmark between BCP's code and yours, then said BCP's is sluggish after a while, without stating your reasons or the conclusion of your findings.

    I assumed that it was obvious. Sorry you missed it.

    Salmon Trout

    • Guest
    Re: how to get the Count of string in file
    « Reply #108 on: August 16, 2010, 12:20:38 AM »
    Quote
    ST said he downloaded "1 million places of pi", then his file name for testing the benchmark is "1 billion places of pi".

    That is absolutely true, I did, but I did then give the file size immediately after.

    Quote
    (1,000,000,002 bytes) with no carriage returns.

    A billion places of decimals? No CRs? One byte per character? One each for the 3 and the decimal point, and 1,000,000,000 for the decimal places.



    ghostdog74



      Specialist

      Thanked: 27
      Re: how to get the Count of string in file
      « Reply #109 on: August 16, 2010, 01:30:17 AM »
      Quote
      That is absolutely true, I did, but I did then give the file size immediately after.
      So now why don't you go edit your post and correct the typo? Change million to billion.

      BC_Programmer


      Re: how to get the Count of string in file
      « Reply #110 on: August 16, 2010, 01:48:05 AM »
      Quote
      so now why don't you go edit your post and correct the typo? change million to billion.

      A:) because he can't

      and B:) it doesn't really matter.

      I mean, come on:

      Quote
      I downloaded a text file containing 1 million places of pi (1,000,000,002 bytes)

      It doesn't take a rocket scientist to see that the bracketed value is in fact 1 billion and 2. Just because this confused you doesn't make it ambiguous, especially since it was later referenced as a billion. In fact it is only noted as a million in the single quoted passage. You are now throwing up a shitestorm because of an obvious typo that is in no way ambiguous (it's clearly a billion, especially, you know, given the file size is a billion).

      ghostdog74



        Specialist

        Thanked: 27
        Re: how to get the Count of string in file
        « Reply #111 on: August 16, 2010, 03:01:31 AM »
        Quote
        A:) because he can't
        Why can't he?
        Quote
        and B:) it doesn't really matter.
        Yes it does. Especially when you are proving something. A typo is a typo and it should be corrected. If not, it's not clear;
        someone might think he meant a million and all his billions are wrong. Isn't that so?

        Salmon Trout

        • Guest
        Re: how to get the Count of string in file
        « Reply #112 on: August 16, 2010, 03:32:19 AM »
        So shall we go through GD74's posts looking for typos? The spirit of Billrich has deeply impregnated this thread.

        ghostdog74



          Specialist

          Thanked: 27
          Re: how to get the Count of string in file
          « Reply #113 on: August 16, 2010, 04:17:55 AM »
          Quote
          so shall we go through GD74's posts looking for typos?
          Go ahead if you are that bored. I don't care. If there is anything I am trying to prove and there are typos, I would be glad to amend it.
          Quote
          The spirit of Billrich has deeply impregnated this thread
          Don't associate me with that guy. If you want to do that, look at yourself in the mirror and tell me why you are any different.

          Salmon Trout

          • Guest
          Re: how to get the Count of string in file
          « Reply #114 on: August 16, 2010, 04:55:04 AM »
          Quote
          Go ahead if you are that bored. I don't care. If there is anything I am trying to prove and there are typos, I would be glad to amend it. Don't associate me with that guy. If you want to do that, look at yourself in the mirror and tell me why you are any different.

          offensive; reported

          CBMatt

          • Mod & Malware Specialist


          • Prodigy

          • Sad and lonely...and loving every minute of it.
          • Thanked: 167
            • Yes
          • Experience: Experienced
          • OS: Windows 7
          Re: how to get the Count of string in file
          « Reply #115 on: August 16, 2010, 05:03:29 AM »
          Not quite sure what is so offensive about the comment in question, but this discussion has obviously gone far beyond the original intent of the thread.  In the future, a bit more maturity and less arguing would be preferred.  Topic locked.
          Quote
          An undefined problem has an infinite number of solutions.
          —Robert A. Humphrey