
Author Topic: Control what a CPU places in the L3 cache area with C++?  (Read 10780 times)


DaveLembke

    Topic Starter


    Sage
  • Thanked: 662
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
Control what a CPU places in the L3 cache area with C++?
« on: February 27, 2018, 08:38:11 AM »
This might not be possible, but I'm curious for performance reasons.

Can you write a program so that it checks whether the CPU has an L3 cache and, if it does, runs out of the fast L3 cache space instead of out of system RAM, which is slower to access?
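
Checking whether the CPU even has an L3 and how big it is seems like the easy part. On Windows, something along these lines should report it (an untested sketch using the GetLogicalProcessorInformation API); it's the "make the program run out of L3" part I can't figure out:

Code: [Select]
#include <windows.h>
#include <iostream>
#include <vector>

// Untested sketch: ask Windows for the processor topology and report any
// level-3 cache descriptors it returns.
int main()
{
    DWORD len = 0;
    GetLogicalProcessorInformation(nullptr, &len);   // first call just reports the required buffer size
    std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(len / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
    if (!GetLogicalProcessorInformation(info.data(), &len))
        return 1;

    bool found = false;
    for (const auto& entry : info)
    {
        if (entry.Relationship == RelationCache && entry.Cache.Level == 3)
        {
            std::cout << "L3 cache: " << (entry.Cache.Size / 1024) << " KB, "
                      << entry.Cache.LineSize << "-byte lines\n";
            found = true;
        }
    }
    if (!found)
        std::cout << "No L3 cache reported.\n";
    return 0;
}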

I have 4 systems with L3 cache: two newly acquired Phenom II X4 945 3.0 GHz CPUs with 6 MB of L3 cache, which I installed into systems that used to be a Sempron 145 2.8 GHz single-core and an Athlon 64 4450B 2.3 GHz dual-core, plus FX 8300 3.3 GHz and FX 8350 4.0 GHz eight-core systems with 8 MB of L3 cache.

Maybe it's purely left to the CPU itself to decide what it places into the fast L3 cache versus system RAM, but I'm curious whether there is a way to control this to get even better performance out of a number-crunching program.

Maybe "shaping" ( data length restrictions ) need to be performed to get something to fit within this 6MB or 8MB cache space so that the CPU places it there vs system RAM like a square peg that can fit the square hole so its then allowed and runs with it?

*Instead of buying a decommissioned quad-core Xeon server, which I was warned is noisy in the 1U rack-mount flavor, I decided to upgrade two slow systems I hadn't been using to Phenom II X4 945 3.0 GHz CPUs at $40 per used CPU. The power consumption isn't bad for the performance-to-power-cost ratio, they run relatively quiet and cool, and I can have two completely different Windows environments with their own keyboard and mouse going at the same time for automation work. So I sort of satisfied my need for a data-cruncher server by pairing two desktop systems running side by side with the best CPU the motherboards could handle inexpensively, which was the Phenom II 945. Now I am curious whether I should write my programs in a way that utilizes the L3 cache, or whether it all just happens on its own, hands off.

My prior server was an HP tower with two Opteron 2216 2.4 GHz dual-cores acting as a quad-core server. It was beyond its useful life as a number cruncher because it was too slow. For a short while I gutted it to remove unneeded features and drop the power consumption as low as I could, but it was a space heater that wasn't very efficient at number crunching. (I removed the HDD and the dual Quadro FX 4600 video cards and installed a GeForce4 MX 440 64 MB video card and a 60 GB SSD, but that was all that could be done to reduce wasted power, since the fans were needed for cooling.) https://www.cpubenchmark.net/cpu.php?cpu=AMD+Opteron+2216

The other benefit of a pair of systems used as servers is that I can also set up a small cluster with the pair if ever needed, to combine their crunching. However, giving each system its own task is easier and more efficient most of the time, since communication across a cluster is limited to the speed of the network and so on; combined computing makes more sense when you have more than just two systems working together as a team.

Geek-9pm


    Mastermind
  • Geek After Dark
  • Thanked: 1026
    • Gekk9pm bnlog
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #1 on: February 27, 2018, 09:39:25 AM »
General reference:
https://en.wikipedia.org/wiki/CPU_cache
Quote
All modern (fast) CPUs (with few specialized exceptions[2]) have multiple levels of CPU caches. The first CPUs that used a cache had only one level of cache; unlike later level 1 caches, it was not split into L1d (for data) and L1i (for instructions). Almost all current CPUs with caches have a split L1 cache. They also have L2 caches and, for larger processors, L3 caches as well. The L2 cache is usually not split and acts as a common repository for the already split L1 cache. Every core of a multi-core processor has a dedicated L2 cache and is usually not shared between the cores. The L3 cache, and higher-level caches, are shared between the cores and are not split. An L4 cache is currently uncommon, and is generally on dynamic random-access memory (DRAM), rather than on static random-access memory (SRAM), on a separate die or chip. That was also the case historically with L1, while bigger chips have allowed integration of it and generally all cache levels, with the possible exception of the last level. Each extra level of cache tends to be bigger and be optimized differently.


Have you already looked at the Intel material?
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/cache-allocation-technology-white-paper.pdf

https://software.intel.com/en-us/forums/software-tuning-performance-optimization-platform-monitoring/topic/745751

For 64-bit CPUs there are specific things you need to know.
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf

For AMD there's some variation, but I have not read the manual.  :-[
https://support.amd.com/TechDocs/24593.pdf

Does that help any?

EDIT: From the AMD manual (24593, Rev. 3.29, December 2017, AMD64 Technology, "System Resources"):
Quote
See "Page-Protection Checks" on page 145 for information on the page-protection mechanism.

Alignment Mask (AM) Bit. Bit 18. Software enables automatic alignment checking by setting the AM bit to 1 when RFLAGS.AC=1. Alignment checking can be disabled by clearing either AM or RFLAGS.AC to 0. When automatic alignment checking is enabled and CPL=3, a memory reference to an unaligned operand causes an alignment-check exception (#AC).

Not Writethrough (NW) Bit. Bit 29. Ignored. This bit can be set to 1 or cleared to 0, but its value is ignored. The NW bit exists only for legacy purposes.

Cache Disable (CD) Bit. Bit 30. When CD is cleared to 0, the internal caches are enabled. When CD is set to 1, no new data or instructions are brought into the internal caches. However, the processor still accesses the internal caches when CD = 1 under the following situations:

Reads that hit in an internal cache cause the data to be read from the internal cache that reported the hit.

Writes that hit in an internal cache cause the cache line that reported the hit to be written back to memory and invalidated in the cache.

Cache misses do not affect the internal caches when CD = 1. Software can prevent cache access by setting CD to 1 and invalidating the caches.

BC_Programmer


    Mastermind
  • Typing is no substitute for thinking.
  • Thanked: 1140
    • Yes
    • Yes
    • BC-Programming.com
  • Certifications: List
  • Computer: Specs
  • Experience: Beginner
  • OS: Windows 11
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #2 on: February 27, 2018, 12:01:15 PM »
You cannot really control what memory gets cached or where it gets cached. You can only write software in a way that makes it cache-friendly:

What is Cache-friendly Code
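
To illustrate what that link means by access patterns, here's a small untested C++ sketch: both functions sum the same matrix, but the first walks memory in the order it's laid out (stride 1), while the second jumps a whole row's worth of bytes between accesses and starts thrashing once the matrix outgrows the cache.

Code: [Select]
#include <cstddef>
#include <vector>

// Row-major traversal of a flat matrix: memory is touched sequentially,
// so every cache line that gets fetched is fully used before eviction.
double sum_row_major(const std::vector<double>& m, std::size_t rows, std::size_t cols)
{
    double total = 0.0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];   // stride-1 access: cache friendly
    return total;
}

// Same data, walked column by column: each access jumps cols * 8 bytes,
// so most accesses miss once the matrix is larger than the cache.
double sum_col_major(const std::vector<double>& m, std::size_t rows, std::size_t cols)
{
    double total = 0.0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];   // large stride: cache hostile
    return total;
}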
I was trying to dereference Null Pointers before it was cool.

DaveLembke

    Topic Starter


    Sage
  • Thanked: 662
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #3 on: February 27, 2018, 08:19:17 PM »
Thanks BC for that link. I read through it, and one of the external links sent me to a Tutorials Point YouTube video that had more information on this.

For some reason I was thinking that the cache could be targeted through an address range labelled and associated with cache, similar to how, if you know a specific address, you can perform memory calls and injection to fetch or write information at that address; if an address can be targeted, then it's known, and you can pass data to and from it. My analogy was a system with a hard drive and an SSD: you can put information on the SSD for faster reads and writes, while data that isn't speed-critical stays on the hard drive. I was thinking there might be a function that could target the cache directly, but it's hands-off and handled by the CPU alone. So it seems you can address system RAM directly, but not the cache. The cache is hands-off, and it's a matter of keeping a program small so that the CPU is likely to hold it in its internal cache.

The program I am running is 553 KB in size, so I guess it's pretty likely the L3 is being used for it: with 6 MB of L3 available across all 4 cores, and 4 instances of the single-threaded program running, the 4 instances would only take roughly 2.2 MB of the 6 MB. On the processors that only have L1 and L2 cache it was probably less efficient and hitting system RAM more, because with 512 KB of L2 per core (2 MB total) the 553 KB program, times 4 single-threaded instances, doesn't fit in L2 in its entirety.

It looks like making a program that uses cache memory well is mostly a matter of keeping its size and working set small; data that is called over and over again the CPU will pick up on and place into its caches at all three levels on processors with L1, L2, and L3, such as the AMD Phenom and FX processors I have.
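
So if there is any "shaping" to be done, it seems to be just structuring the work so the hot data fits in the cache and gets reused before moving on. Something like this rough untested sketch is what I have in mind (the chunk size is only a guess, and the CPU still decides what it actually keeps):

Code: [Select]
#include <algorithm>
#include <cstddef>
#include <vector>

// Process a big array in chunks small enough to stay resident in L3,
// doing all the repeated passes over one chunk before moving to the next.
// The 4 MB figure is an assumption, not something the program can enforce.
void process_in_chunks(std::vector<double>& data)
{
    const std::size_t chunk_bytes = 4 * 1024 * 1024;            // leave some of the 6 MB L3 for everything else
    const std::size_t chunk_elems = chunk_bytes / sizeof(double);

    for (std::size_t start = 0; start < data.size(); start += chunk_elems)
    {
        const std::size_t end = std::min(start + chunk_elems, data.size());
        for (int pass = 0; pass < 10; ++pass)                    // repeated passes reuse data while it is still cached
            for (std::size_t i = start; i < end; ++i)
                data[i] = data[i] * 1.0000001 + 0.5;             // placeholder number crunching
    }
}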

I've been working on making a program more efficient, and I was thinking that if I could get it into the L3 cache it would run with fewer wasted clock cycles. I've even thought about porting it to Linux and running it from a distro without a GUI, because Windows itself has processing overhead that is waste. I also did some reading into methods of using GPUs for mathematical crunching, but I haven't found any examples that show an easy way to tap into a GPU to process a program instead of the CPU.

I have two GTX 570 video cards and a GTX 780 Ti. Knowing that GPUs are better at crunching numbers (the cryptocurrency miners used them for a while, which points out this greater efficiency), it might be the better way to go, but I have yet to find any examples showing how to load a program onto a GPU and have it crunch away.

The project I am working on is sort of looking for needles in a haystack. It's all out of curiosity: the program shuffles 89 characters with a seeded random, where the seed is the key used to scramble and unscramble information. Because it's probable to have a perfect shuffle, where, as with a deck of cards, you could shuffle the deck right back into the order in which it was purchased (2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A for each suit), I've been interested in using a long long int as the seed and finding how frequently weak keys occur, as well as hunting down the worst keys, the ones that shuffle back to exactly or nearly the original order. There isn't enough processing power in the world to run through every combination of 89 characters (permutations of 89) in a shuffle. Here is an interesting link on the permutations of 52 (a deck of cards): https://www.quora.com/How-many-combinations-can-a-deck-of-52-cards-make

In the program I use, the best way to avoid a weak key is simply to test the key for strength before it's used, which takes a fraction of a second. For the weak-key search, I currently give the program a starting value, an end value, and a flag value (threshold): if that many or more characters match between their position in array 1 and their position in array 2 in an iteration, it writes that key value to a file so I can check later how bad it really is. Looking for 15 or more of the 89 characters shuffling back to their original order, I have run through 10 billion keys so far with no hits. If I drop the threshold to 8 they start popping up here and there, so I know the program is doing what it's designed to do rather than failing to report because of a flaw in the code. It's just that 15 or more characters shuffling back to their original positions is extremely rare, at least according to the first 10 billion keys tested.
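
The core of the weak-key check is just counting how many characters land back in their original positions after the shuffle, roughly like this (a sketch with placeholder names, not the real program, which loops over a seed range and logs the hits to a file):

Code: [Select]
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <string>

// How many characters ended up back in their original slots?
int count_fixed_points(const std::string& original, const std::string& shuffled)
{
    int matches = 0;
    for (std::size_t i = 0; i < original.size(); ++i)
        if (original[i] == shuffled[i])
            ++matches;
    return matches;
}

// A key is "weak" if the seeded shuffle leaves at least `threshold`
// characters where they started. random_shuffle() typically draws from
// rand(), mirroring how the real program is seeded; it is deprecated in
// newer C++ standards.
bool is_weak_key(unsigned int seed, const std::string& charset, int threshold)
{
    std::string shuffled = charset;
    std::srand(seed);
    std::random_shuffle(shuffled.begin(), shuffled.end());
    return count_fixed_points(charset, shuffled) >= threshold;
}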

Performance of the Phenom II X4 945 3.0 GHz CPU is pretty good. I am able to test 1 billion keys every 2 hours and 17 minutes, with 250 million keys per core and 4 instances of the single-threaded program running, each with core affinity set to its own core.

The project is mainly a curiosity, not an insanity. And it acts as a small space heater during the cooler months of the year; it's currently 30F / -1C outside :)

DaveLembke

    Topic Starter


    Sage
  • Thanked: 662
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #4 on: February 28, 2018, 12:18:03 AM »
In relation to what I discussed in my prior post, I think I found what I need to learn to make use of the nVidia video cards for crunching and get the program running on the GPUs that I have. Sharing this here in case anyone else is interested in this sort of thing.

https://developer.nvidia.com/how-to-cuda-c-cpp

BC_Programmer


    Mastermind
  • Typing is no substitute for thinking.
  • Thanked: 1140
    • Yes
    • Yes
    • BC-Programming.com
  • Certifications: List
  • Computer: Specs
  • Experience: Beginner
  • OS: Windows 11
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #5 on: February 28, 2018, 03:31:33 PM »
Of note: trying to keep executing loops/code in the cache is an optimization technique covered in Michael Abrash's Graphics Programming Black Book. It covers graphics programming, of course, but it has extensive coverage of CPU performance characteristics and optimization at the assembly level. It doesn't really cover modern systems (I think the Pentium was the latest CPU at the time of its release), but most of the underlying concepts are sound.

Worthwhile mention as well:

Quote
it's probable to have a perfect shuffle

And I think this might highlight some of the problems with your approach. Exactly which permutations you can get depends on the shuffle algorithm. How are you shuffling?

My own personal preference has been to shuffle using a sort algorithm:

Code: [Select]
var cards = Enumerable.Range(0, 52);                      // 52 cards, values 0 through 51 (Range takes a count, not an end value)
var shuffledcards = cards.OrderBy(a => Guid.NewGuid());   // requires System.Linq

(C#). Basically, instead of swapping elements around like Fisher-Yates, it just takes the sequence, associates a random GUID with each element, and sorts the elements by those random keys. (Strictly speaking, a correctly implemented Fisher-Yates is a fair shuffle and can reproduce the original order; it's Sattolo's variant, which only ever swaps with earlier positions, and the common naive swap-with-anything implementation, that are not fair.)

Despite this, I think that for card shuffling and your specific approach you have a much bigger issue, possibly something you haven't even considered: not enough entropy.

As you linked, there are 52! possible combinations of cards in a 52-card deck.

The issue is that most random number generators use a 32-bit seed. That means there are only 2^32 possible random number sequences, so of those 52! possible combinations it would only ever be possible to generate around 4 billion, a teensy tiny fraction. In fact, the most elements you can fairly shuffle with a 32-bit seed is 12; once you hit 13, there are more permutations than seeds. Even the GUID I used above is only 122 bits of entropy, which is still a vanishingly small fraction of the total problem space.


Since 52! is larger than 2^225 but smaller than 2^226, you would need at least a 226-bit seed to be able to generate every possible permutation when shuffling a 52-card deck. With anything less, you cannot reach all possible sequences.
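
For what it's worth, one way in C++ to feed a shuffle more than 32 bits of seed (an untested sketch, and still nowhere near a guaranteed 226 bits of coverage) is to seed a large-state generator from several words of OS-provided randomness:

Code: [Select]
#include <algorithm>
#include <array>
#include <cstdint>
#include <random>

// Seed a 64-bit Mersenne Twister from several words of std::random_device
// output instead of a single 32-bit srand() value. The generator's state is
// 19937 bits, so it is at least capable of reaching far more orderings than
// a 32-bit seed allows; how much is actually reachable depends on seeding.
template <typename Sequence>
void shuffle_with_wide_seed(Sequence& cards)   // works for std::vector, std::string, etc.
{
    std::random_device rd;
    std::array<std::uint32_t, 8> entropy;      // 256 bits of seed material
    for (auto& word : entropy)
        word = rd();
    std::seed_seq seq(entropy.begin(), entropy.end());
    std::mt19937_64 gen(seq);
    std::shuffle(cards.begin(), cards.end(), gen);   // std::shuffle is an unbiased Fisher-Yates-style shuffle
}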
I was trying to dereference Null Pointers before it was cool.

DaveLembke

    Topic Starter


    Sage
  • Thanked: 662
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #6 on: February 28, 2018, 04:41:31 PM »
Thanks for the info BC and very interesting...

Curious what's going on with seed values beyond the 32-bit limit, around the 4 billion mark... do you think it's rolling over like an odometer back to 0, so that the outcome of a shuffle with a seed of 0 and a seed of 4,000,000,001 becomes the same shuffle output?



BC_Programmer


    Mastermind
  • Typing is no substitute for thinking.
  • Thanked: 1140
    • Yes
    • Yes
    • BC-Programming.com
  • Certifications: List
  • Computer: Specs
  • Experience: Beginner
  • OS: Windows 11
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #7 on: February 28, 2018, 04:57:35 PM »
I'm not sure I understand what you mean. If you are using a 32-bit seed (e.g. an int) then there isn't anything "beyond" it. The size of the seed is dictated by the algorithm. You typically cannot use srand(), for example, with anything but a 32-bit seed.

The seed dictates the sequence of random numbers. If a random number algorithm only accepted, say, a byte seed, then there could only be 256 unique sequences that could ever be generated by the algorithm. Asking what happens in that case if you give it a seed larger than 8 bits doesn't really make sense. It's like asking what happens if you store 8 TB on a 1 TB hard drive.

Perhaps a smaller example will make this clearer. Let's say we have a random number algorithm that takes a byte as a seed. So we've got 256 unique sequences of random numbers: 0 gives us one sequence, 1 gives us another, and so on.

If you have a sequence of 5 elements, all possible permutations are represented, because there are only 120 possible configurations of those 5 elements.

If you add a 6th element, however, there are 720 possible permutations. Since you only have 256 possible sequences with that limited random number generator, you can only ever generate 256 of the possible permutations of those 6 elements.


There are arguably workarounds that can extend this. For example, the sequence with seed 6 could "skip" some early values and instead start later in the sequence to (hopefully) get an ordering that isn't represented in the first 6 values of any other sequence, but it is very unreliable.
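
Here's a quick untested C++ illustration of the byte-seed idea: drive a shuffle of 6 elements from every one of the 256 possible seeds and count the distinct orderings that actually show up. It can never exceed 256, even though 6 elements have 720 permutations.

Code: [Select]
#include <algorithm>
#include <iostream>
#include <numeric>
#include <random>
#include <set>
#include <vector>

int main()
{
    std::set<std::vector<int>> seen;             // distinct orderings produced
    for (int seed = 0; seed < 256; ++seed)       // every possible 8-bit seed
    {
        std::vector<int> v(6);
        std::iota(v.begin(), v.end(), 0);        // 0,1,2,3,4,5
        std::mt19937 gen(seed);
        std::shuffle(v.begin(), v.end(), gen);
        seen.insert(v);
    }
    // 6! = 720 permutations exist, but at most 256 can ever appear here.
    std::cout << "distinct orderings from 256 seeds: " << seen.size() << "\n";
    return 0;
}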

I was trying to dereference Null Pointers before it was cool.

DaveLembke

    Topic Starter


    Sage
  • Thanked: 662
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #8 on: February 28, 2018, 07:04:03 PM »
Hello BC... as per

Quote
I'm not sure I understand what you mean. If you are using a 32-bit seed (e.g. an int) then there isn't anything "beyond" it. The size of the seed is dictated by the algorithm. You typically cannot use srand(), for example, with anything but a 32-bit seed.

I was thinking that beyond the 32-bit limit it might loop back to the beginning of the 32 bits like an odometer, or like rolling the score back to 0 in Asteroids as in my last chat here, so that a seed value of -1 equals a seed value of 2147483647, a seed value of 0 equals 2147483648, and a seed value of 1 equals 2147483649. Butchering the code into a loopable shuffle display proved that it does in fact loop back / roll over like an odometer. I supplied the code below in case you want to check it out further.

Additionally, when trying values exceeding the 32-bit limit of 2147483647, my program, which until today didn't have this limitation (I have now added a rule to disallow values greater than that), rolled the output over to the same output as the beginning; but because the input is larger than 32 bits, the spill-over caused havoc with the next user input. With 2147483648 (one too many) it starts scrolling u's across the screen, like uuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu, until I kill the program, and with 2147483649 it starts scrolling a character that isn't one of the 89 characters my program accepts, so it scrolls error messages telling the user it's an invalid character and to enter a valid one, in an endless loop.

Thank you for pointing out this seed limitation; I wasn't aware of the 32-bit limit. At least I picked a better time of year (winter) to use a computer as a space heater, not knowing until today that values input beyond 2147483647 just loop back around to 0, an odometer that keeps rolling over, so there are no shuffles back to the original order hiding beyond what the 32-bit seed limitation allows.  :)

Do you know, if programming for a 64-bit OS with an IDE able to create 64-bit programs, whether the seed would then support a 64-bit value, or is the seed strictly 32 bits regardless of whether the program is compiled as 32- or 64-bit? Most people use a time-based seed, so 32 bits is plenty for user-input-delay-assisted randomness; I can see how maybe it was never changed to permit a larger 64-bit seed value.  ;D


Quote

                      Enter Integer Seed:


-1


unCrypt String Key =

18cQUHTZ7Rt4<jAzi3"Je:x/,h]Vk>yGWCXSw#p2%96'D0OE-N@IF+Y~?l(a\qnB&v.[!=5KsmdMob)f
*ug^Pr_L$;


Run again 1 = Yes and 0 = No
1
                      Enter Integer Seed:


0


unCrypt String Key =

_Ui?$dopSQ[O"lfLn<Nrv-\j!MJ/4%3Cy6IZk7APY>0R,wqm@&H8e'cx2)z.=^#u(Dg;*9h:a5KX]TtE
s~WbG1VBF+


Run again 1 = Yes and 0 = No
1
                      Enter Integer Seed:


1


unCrypt String Key =

MBJxAb7~[tS!go)zapwRIv#%$HG?FE</3j5nZD]Q0&ONY4@TU8=hVc>Pyiks.mu9,\6WlX('f:-KLC"q
e1_2^d;r+*


Run again 1 = Yes and 0 = No
1
                      Enter Integer Seed:


2147483647


unCrypt String Key =

18cQUHTZ7Rt4<jAzi3"Je:x/,h]Vk>yGWCXSw#p2%96'D0OE-N@IF+Y~?l(a\qnB&v.[!=5KsmdMob)f
*ug^Pr_L$;


Run again 1 = Yes and 0 = No
1
                      Enter Integer Seed:


2147483648


unCrypt String Key =

_Ui?$dopSQ[O"lfLn<Nrv-\j!MJ/4%3Cy6IZk7APY>0R,wqm@&H8e'cx2)z.=^#u(Dg;*9h:a5KX]TtE
s~WbG1VBF+


Run again 1 = Yes and 0 = No
1
                      Enter Integer Seed:


2147483649


unCrypt String Key =

MBJxAb7~[tS!go)zapwRIv#%$HG?FE</3j5nZD]Q0&ONY4@TU8=hVc>Pyiks.mu9,\6WlX('f:-KLC"q
e1_2^d;r+*


Run again 1 = Yes and 0 = No


Code: [Select]
#include <cstdlib>
#include <cstring>   // needed for strncpy
#include <iostream>
#include <algorithm>
#include <string>

using namespace std;

int main(int argc, char *argv[])
{
    long long int seed1 = 0;
    int run = 1;

    while (run == 1) {
        cout << "                      Enter Integer Seed: \n"; // Ask the user for an integer seed
        cout << "\n\n";
        cin >> seed1; // Read the user's seed
        cout << "\n\n" << "unCrypt String Key =\n\n";

        // Initialize valid characters for the string shuffle output
        // Note: bug corrected for \ by using a preceding escape character
        // string str="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890!@#$%^&*()_-+=?<>:\\/~.,;"; OLD v 1.02
        string str = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890!@#$%^&*()_-+=?<>:\\/~.,;[]\"\'"; // New v 1.10

        // Copy the static character list into a char array prior to the shuffle
        // (tab3 is not used below; kept to match the original program)
        string tmp1 = str;                              // copy of str
        char tab3[128];                                 // buffer for the array copy
        strncpy(tab3, tmp1.c_str(), sizeof(tab3));      // copy tmp1 into tab3
        tab3[sizeof(tab3) - 1] = 0;                     // guarantee null termination

        // srand() takes an unsigned int, so the long long seed is truncated
        // to its low 32 bits here.
        srand(seed1);

        // random_shuffle() draws from rand() on this implementation (and is
        // deprecated in newer C++ standards in favor of std::shuffle).
        random_shuffle(str.begin(), str.end());         // shuffle the string
        cout << str << "\n\n\n";                        // output the shuffled sequence

        cout << "Run again 1 = Yes and 0 = No\n";
        cin >> run;
    }
    return EXIT_SUCCESS;
}

BC_Programmer


    Mastermind
  • Typing is no substitute for thinking.
  • Thanked: 1140
    • Yes
    • Yes
    • BC-Programming.com
  • Certifications: List
  • Computer: Specs
  • Experience: Beginner
  • OS: Windows 11
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #9 on: February 28, 2018, 09:28:30 PM »
Quote
I was thinking that beyond the 32-bit limit it might loop back to the beginning of the 32 bits like an odometer, or like rolling the score back to 0 in Asteroids as in my last chat here, so that a seed value of -1 equals a seed value of 2147483647, a seed value of 0 equals 2147483648, and a seed value of 1 equals 2147483649. Butchering the code into a loopable shuffle display proved that it does in fact loop back / roll over like an odometer. I supplied the code below in case you want to check it out further.

In C/C++, if you add or subtract outside of an integer type's range, it will underflow or overflow. To use my byte example, 0-255 is the unsigned range and -128 to 127 is the signed range. Adding 5 to an unsigned byte of 255 gives you 4; there is no such thing as a byte value of 260, nor is there such a thing as an unsigned byte value of -5. Add one to a signed byte of 127 and you get -128: same deal, there is no such thing as a signed byte with a value of 128. It's not intelligent logic being applied; it isn't designed to go "well, this is too big, I'll just give a different number by wrapping around." It's a result of the way the CPU does arithmetic. As I understand it, for integer operations, if a carry bit is present after all bits of the operands have been processed, an overflow flag is set that can be checked by software. The CPU just does the arithmetic and sets the flag, if the software cares enough to check; C/C++ doesn't care on its own.


Quote
Additionally, when trying values exceeding the 32-bit limit of 2147483647, my program, which until today didn't have this limitation

2147483647 is the maximum positive value of a signed 32-bit integer. For a signed integer there are another 2147483648 available values below zero, since the range is -2147483648 to 2147483647.

srand() takes an unsigned integer, though, not a signed one. The long long int you use to hold the seed value doesn't get truncated until you call srand() with it, at which point it gets cast/truncated (I get a compiler warning about this with your program, actually). Since it is an unsigned parameter, you would have to go beyond 4294967295 or below 0 to wrap, and since you aren't actually printing out the seed that ends up being used, it's not immediately obvious.

You can observe this directly, though. The output from 4294967295 and -1 is identical, for example, and so is the output from 4294967298 and 2. The value can be represented as a long long int, but when it's passed to srand() it gets forced to an unsigned integer by dropping the higher-order bits. By way of example, the 16-bit integer 257 is 00000001 00000001; cast to an 8-bit integer it becomes 00000001, a byte with value 1. The same logic applies when casting a 64-bit long long int to a 32-bit int: any data in the higher bits is stripped off, so you get "wrapping" behaviour.
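
You can see the same truncation without involving srand() at all by applying the cast explicitly (untested sketch):

Code: [Select]
#include <iostream>

int main()
{
    long long seeds[] = { -1, 4294967295LL, 2, 4294967298LL };
    for (long long s : seeds)
    {
        // The same narrowing that happens when a long long is passed to
        // srand(unsigned int): only the low 32 bits survive.
        unsigned int truncated = static_cast<unsigned int>(s);
        std::cout << s << " -> " << truncated << "\n";
    }
    return 0;   // prints 4294967295 twice and 2 twice: the pairs collapse to the same seed
}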


Quote
Do you know, if programming for a 64-bit OS with an IDE able to create 64-bit programs, whether the seed would then support a 64-bit value, or is the seed strictly 32 bits regardless of whether the program is compiled as 32- or 64-bit? Most people use a time-based seed, so 32 bits is plenty for user-input-delay-assisted randomness; I can see how maybe it was never changed to permit a larger 64-bit seed value.  ;D

No, this has nothing to do with the bit width of the CPU or the operating system. srand() accepts an unsigned int (32 bits on typical platforms), and I think it has done so more or less since the standard library was originally created. You would need to write your own random number generator, or find one already implemented, that allows for larger seed values.
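
As an example of the "find one already implemented" route (an untested sketch), the standard <random> generators accept a full 64-bit seed, so the key shuffle could be driven like this instead of srand() plus random_shuffle():

Code: [Select]
#include <algorithm>
#include <random>
#include <string>

// Shuffle the 89-character key string from a full 64-bit seed.
// std::shuffle with a generator object replaces srand() + random_shuffle(),
// and mt19937_64 takes the whole 64-bit value as its seed.
std::string make_key(const std::string& charset, unsigned long long seed)
{
    std::string key = charset;
    std::mt19937_64 gen(seed);
    std::shuffle(key.begin(), key.end(), gen);
    return key;
}

That gives you the full 64-bit seed space, though as noted above even 2^64 is still a tiny slice of the 89! possible orderings.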
I was trying to dereference Null Pointers before it was cool.

DaveLembke

    Topic Starter


    Sage
  • Thanked: 662
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
Re: Control what a CPU places in the L3 cache area with C++?
« Reply #10 on: March 01, 2018, 11:03:49 AM »
Cool.. Thanks for making it more clear BC  8)