Author Topic: how to get the Count of string in file  (Read 7671 times)
ghostdog74
Mentor



Thanked: 26
Posts: 1,511


« Reply #105 on: August 15, 2010, 08:58:04 PM »

Quote
Ghost,
Your grep works
Of course. It's better than using sed, which you proclaim is the "best".

Quote
I had an old 2005 version.
Time to change. We are not living in olden times anymore.

Quote
Your skill level has improved. Who is your Tutor?
I have been playing with *nix since ancient times. My tutor is greg and bill rich, now it's you and vishu...
IP logged

BC_Programmer
Mastermind


Thanked: 697
Posts: 15,874

Computer: Specs
Experience: Beginner
OS: Windows 7


Pinkie Pie is best pony

BC-Programming.com 1 1
« Reply #106 on: August 15, 2010, 09:10:35 PM »

Quote
I reiterate my point. Size of a file does not matter if what you are comparing is the output of 2 pieces of code.
That's what I said.

Quote
you want to make sure the output produced by the 2 pieces of code are the same.
Agree.

Quote
Size of file does matter in a benchmark, when you are concerned about the way the program is written and the algorithm used. That is, whether you have used the most optimized method when dealing with big files.
 

Yes, it does, but only if you perform a <single> benchmark. ST did two. Therefore there are two points of reference and, as I noted, a linear formula can be derived from those two data points that roughly approximates a short range of the values of whatever the actual relationship between them is. A third data point will be enough to create a parabola, but that doesn't mean that the performance relationship is a parabola; it's just all you can do with 3 points. It could very well be a cubic function of the size of the input.
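To make the point about two and three data points concrete, here is a minimal Python sketch. The cubic `true_cost` function and the sizes 1, 2 and 3 are assumptions invented for illustration, not anyone's measured timings. It fits a line through two points and a parabola through three (Lagrange interpolation), then compares both with the assumed real cost.

```python
from fractions import Fraction


def lagrange(points):
    """Return a function evaluating the unique polynomial through the points."""
    def poly(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return poly


def true_cost(n):
    # Hypothetical "real" relationship: cubic in the input size.
    return n ** 3


sizes = [1, 2, 3]
line = lagrange([(n, true_cost(n)) for n in sizes[:2]])   # two benchmarks
parabola = lagrange([(n, true_cost(n)) for n in sizes])   # three benchmarks

for n in (2, 3, 10):
    print(n, true_cost(n), line(n), parabola(n))
```

Both fits pass exactly through the points they were built from, yet at n = 10 the line gives 64 and the parabola 496 against an assumed true cost of 1000. The fit tells you about the points, not about the shape of the underlying relationship.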

The thing is, here, we can <SEE> the code. We can see why, right off, a larger file would make a difference. It doesn't matter if that larger file is larger by a million or a billion bytes; it's still larger, and that difference is reflected in the timings. The reason is rather obvious, as partially evidenced by your quick mention of it.


Quote
Because of the size of the file, you have chosen to read the files in chunks.
That's a direct consequence of taking size into consideration when designing your program. That's why size does matter in a benchmark. 1 million is way different from 1 billion!

Oh, yes, of course, because everybody knows that you can't read in chunks for both 1 million and 1 billion. I obviously specifically designed it for the exact size that ST gave, I was in no way trying to make it more generic and efficient for smaller files (which it is, even a 128K file will benefit from chunk reading because it causes less stress on the task allocator and also causes less process memory fragmentation).
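For readers who want to see what chunked reading looks like for the thread's original question (counting a string in a file), here is a minimal Python sketch. It is not the code posted earlier in the thread; the function name `count_in_chunks`, the 64 KB default chunk size and the choice to count overlapping matches are all assumptions made for illustration. The same loop works unchanged whether the file is 128K or a billion bytes.

```python
def count_in_chunks(path, needle, chunk_size=64 * 1024):
    """Count (possibly overlapping) occurrences of the bytes `needle` in a file,
    reading at most `chunk_size` bytes at a time."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1   # bytes to carry over so boundary matches are seen
    count = 0
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            pos = buf.find(needle)
            while pos != -1:
                count += 1
                pos = buf.find(needle, pos + 1)
            # The tail is shorter than the needle, so a match can never lie
            # entirely inside it and nothing is counted twice.
            tail = buf[-overlap:] if overlap else b""
    return count
```

A call such as `count_in_chunks("pi.txt", b"14159")` (the file name is hypothetical) keeps memory use bounded by the chunk size rather than by the file size.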

If you want to get right down to it, all benchmarks are flawed because of the timing code: it changes the results by being there, but you can't get results without it. The difference is that the benchmark code surrounds all the different timed blocks, and therefore that fact can be ignored in the results.
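As a rough sketch of that idea, the Python below wraps every candidate in the same timing harness, so whatever overhead the harness adds is applied equally to each one. The names `time_block`, `candidate_a` and `candidate_b` are hypothetical stand-ins, not the routines compared in this thread.

```python
import time


def time_block(fn, repeats=5):
    """Return the best wall-clock time in seconds over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def candidate_a():
    # Stand-in workload using the built-in sum.
    return sum(range(100_000))


def candidate_b():
    # Stand-in workload using an explicit loop.
    total = 0
    for i in range(100_000):
        total += i
    return total


# Every candidate goes through the identical harness.
timings = {f.__name__: time_block(f) for f in (candidate_a, candidate_b)}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s")
```

The absolute numbers include the cost of the two `perf_counter` calls, but since that cost is the same for every candidate it drops out when the timings are compared with each other.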


I will agree that there are certainly instances where a million and a billion are a significant difference algorithm-wise, but at the same time, is not even the slightest floating point error in an algorithm a huge difference when it comes to the algorithms for trigonometric functions? It's a matter of the goal of the code in question as to exactly what constitutes a significant difference. In this case, ST was essentially testing a large file. That was all I considered; I wasn't making sweeping design changes based on the fact that it was in millions as opposed to billions, but rather generic changes where it won't matter whether it was a million or a billion. Will the timing be different for a billion and a million? Of course it will. And I will agree that in that sense, the results are flawed. But you assume that my changes are based on his results, when in fact they are merely based on the simple premise that it doesn't work properly for large files. I didn't pay very close attention to the specific timings of them, because all I needed to know was that it was slower with larger files. I didn't need to know how many milliseconds it took to process with X characters.
IP logged

My Blog

BASeBlock 2.3.0 (NOW WITH MACGUFFINS!)
Salmon Trout
Sage



Thanked: 546
Posts: 7,948

Computer: Specs
Experience: Beginner
OS: Unknown

1
« Reply #107 on: August 16, 2010, 12:14:19 AM »

Quote
You showed a benchmark between BCP's code and your code, then say BCP's one is sluggish after a while, without stating your reasons and the conclusion of your findings.

I assumed that it was obvious. Sorry you missed it.
IP logged


Proud to be European
Salmon Trout
Sage



Thanked: 546
Posts: 7,948

Computer: Specs
Experience: Beginner
OS: Unknown

1
« Reply #108 on: August 16, 2010, 12:20:38 AM »

Quote
ST said he downloaded "1 million places of pi", then his file name for testing the benchmark is "1 billion places of pi".

That is absolutely true, I did, but I did then give the file size immediately after.

Quote
(1,000,000,002 bytes) with no carriage returns.

A billion places of decimals? No CRs? One byte per character? One each for the 3 and the decimal point, and 1,000,000,000 for the decimal places.
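The arithmetic can be checked in a couple of lines of Python. This is only a sketch of the sum described above, assuming one byte per character and no line endings, as stated in the post.

```python
DECIMAL_PLACES = 1_000_000_000

# One byte for the leading "3", one for the decimal point, one per decimal place.
expected_bytes = len("3") + len(".") + DECIMAL_PLACES
print(f"{expected_bytes:,}")  # prints 1,000,000,002
```

With only a million places, the same sum would give 1,000,002 bytes, which is why the quoted file size settles the question.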


IP logged


Proud to be European
ghostdog74
Mentor



Thanked: 26
Posts: 1,511


« Reply #109 on: August 16, 2010, 01:30:17 AM »

Quote
That is absolutely true, I did, but I did then give the file size immediately after.
So now why don't you go edit your post and correct the typo? Change million to billion.
IP logged

BC_Programmer
Mastermind


Thanked: 697
Posts: 15,874

Computer: Specs
Experience: Beginner
OS: Windows 7


Pinkie Pie is best pony

BC-Programming.com 1 1
« Reply #110 on: August 16, 2010, 01:48:05 AM »

Quote
So now why don't you go edit your post and correct the typo? Change million to billion.

A:) because he can't

and B:) it doesn't really matter.

I mean, come on:

Quote
I downloaded a text file containing 1 million places of pi (1,000,000,002 bytes)

It doesn't take a rocket scientist to see that the bracketed value is in fact 1 billion and 2. Just because this confused you doesn't make it ambiguous, especially since it was later referenced as a billion. In fact it is only noted as a million in the single quoted passage. You are now throwing up a shitestorm because of an obvious typo that is in no way ambiguous (it's clearly a billion, especially, you know, given the file size is a billion).
IP logged

My Blog

BASeBlock 2.3.0 (NOW WITH MACGUFFINS!)
ghostdog74
Mentor



Thanked: 26
Posts: 1,511


« Reply #111 on: August 16, 2010, 03:01:31 AM »

Quote
A:) because he can't
Why can't he?
Quote
and B:) it doesn't really matter.
Yes it does, especially when you are proving something. A typo is a typo and it should be corrected. If not, it's not clear;
someone might think he meant a million and all his billions are wrong. Isn't that so?
IP logged

Salmon Trout
Sage



Thanked: 546
Posts: 7,948

Computer: Specs
Experience: Beginner
OS: Unknown

1
« Reply #112 on: August 16, 2010, 03:32:19 AM »

So shall we go through GD74's posts looking for typos? The spirit of Billrich has deeply impregnated this thread.
IP logged


Proud to be European
ghostdog74
Mentor



Thanked: 26
Posts: 1,511


« Reply #113 on: August 16, 2010, 04:17:55 AM »

Quote
So shall we go through GD74's posts looking for typos?
Go ahead if you are too bored. I don't care. If there is anything I am trying to prove and there are typos, I would be glad to amend them.
Quote
The spirit of Billrich has deeply impregnated this thread
Don't associate me with that guy. If you want to do that, look at yourself in the mirror and tell me why you are any different.
IP logged

Salmon Trout
Sage



Thanked: 546
Posts: 7,948

Computer: Specs
Experience: Beginner
OS: Unknown

1
« Reply #114 on: August 16, 2010, 04:55:04 AM »

Quote
Go ahead if you are too bored. I don't care. If there is anything I am trying to prove and there are typos, I would be glad to amend them. Don't associate me with that guy. If you want to do that, look at yourself in the mirror and tell me why you are any different.

offensive; reported
IP logged


Proud to be European
CBMatt
Mod & Malware Specialist
Prodigy



Thanked: 160
Posts: 6,033

Experience: Experienced
OS: Windows 7


Sad and lonely...and loving every minute of it.

1
« Reply #115 on: August 16, 2010, 05:03:29 AM »

Not quite sure what is so offensive about the comment in question, but this discussion has obviously gone far beyond the original intent of the thread.  In the future, a bit more maturity and less arguing would be preferred.  Topic locked.
IP logged

Quote
An undefined problem has an infinite number of solutions.
—Robert A. Humphrey

Actually, the name's Chris...
Copyright © 2010 Computer Hope ® All rights reserved.