[#] Mon May 30 2011 15:35:36 EDT from fleeb @ Uncensored


Sorry, dothebart... I just needed to clean things up so it'd be easier to pull into another environment.  I had a lot of other crap that didn't need to be there (e.g. precompiled header, junk organized in a win32-specific way, etc).



[#] Mon May 30 2011 15:36:57 EDT from fleeb @ Uncensored

Subject: Re:


 

Mon May 30 2011 10:44:10 EDT from saltine @ Uncensored
Maybe your code is small enough that it remains inside the data and code buffers of the cpu, too?

Maybe.  If so, that's still kind of cool.



[#] Mon May 30 2011 19:31:19 EDT from LoanShark @ Uncensored


thought I'd check on the code's performance.  I had to resort to a
very high resolution timer, since I wasn't working with a huge amount
of data.  I was pleased with the results... it seemed very fast

when microbencharking... if the code is too small to measure in milliseconds, you may want to just run it through a loop about 10K times...
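A minimal sketch of the "run it through a loop" idea, assuming a C++11 compiler. The helper name `average_ns` and the default of 10000 iterations are illustrative, not something from fleeb's code:

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: calls `work` `iterations` times and returns the
// average wall-clock time per call, in nanoseconds.
template <typename Fn>
double average_ns(Fn work, std::size_t iterations = 10000) {
    typedef std::chrono::steady_clock clock;
    const clock::time_point start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        work();
    }
    const clock::time_point stop = clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

One caveat: if the compiler can prove the work has no visible effect, it may delete the loop entirely, so have the work write its result somewhere observable (a `volatile` variable is the blunt way to do that).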

[#] Mon May 30 2011 19:40:19 EDT from LoanShark @ Uncensored


http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

That's great. The kind of appalling trick we all love and our bosses hate during code review. ;)

[#] Mon May 30 2011 19:52:24 EDT from LoanShark @ Uncensored


buffer, and discovered my multithreaded approach took exponentially
more time than a single-threaded approach I tried (possibly because I
had over 100 of these copies to do, and I tried to do them all at the
same time instead of perhaps 2 at a time... not sure).

Indeed. The first thing you need to do, when performing CPU-intensive multithreaded work, is make sure you have 1 thread per CPU (or HyperThread virtual CPU). Context switching is a *big deal* for that sort of job, because not only do you have to save register state when switching, but you also have TLB flushes and increased inter-core cache coherence dependencies (communication costs, generally) and a lot more stuff that won't fit into L1/L2 cache.
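For the C++ side, here is a rough sketch of "one thread per CPU", assuming C++11's `std::thread`. The names `worker_count` and `run_jobs` are made up for illustration. A fixed set of workers pulls job indices from a shared counter, so 100 copies never turn into 100 threads:

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: number of worker threads to start for CPU-bound work.
std::size_t worker_count() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;  // 0 means the library could not tell
}

// Hypothetical helper: runs job(i) for every i in [0, jobs) using at most
// worker_count() threads. Each worker claims the next unclaimed index.
template <typename Fn>
void run_jobs(std::size_t jobs, Fn job) {
    const std::size_t workers = std::min(worker_count(), jobs);
    std::atomic<std::size_t> next(0);
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= jobs) break;
                job(i);
            }
        });
    }
    for (std::size_t w = 0; w < pool.size(); ++w) {
        pool[w].join();
    }
}
```

The shared counter is the simplest form of load balancing, not real work stealing, but it keeps the thread count pinned to what the hardware reports.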

The next thing you need to do is find a good work-stealing framework, such as Java's ForkJoinPool (I'm sure there's an analogue in the C++ world, I just don't know what) and learn how to use it. This means that units of work have to be recursively subdivided into smaller chunks that can be executed in parallel, but not too small or else the overhead of scheduling the chunks will be greater than the benefits.


If you tried to do this for extremely trivial tasks such as "adding up a list of numbers" in a language like Java without a decent macro facility, you'd find that it'll actually slow your code down unless you write a fair bit of boilerplate, hand-inlined code. In order to easily realize the benefits of parallelism, each individual unit of work needs to be fairly large, at least 100 primitive computational steps. "1+1" as in adding up a list of numbers is not enough work.

[#] Mon May 30 2011 20:15:17 EDT from LoanShark @ Uncensored


such as Java's ForkJoinPool (I'm sure there's an analogue in the C++
world, I just don't know what) and learn how to use it. This means that


Perhaps unsurprisingly, apparently it's in boost: http://lists.boost.org/Archives/boost/2008/09/142801.php

[#] Mon May 30 2011 20:29:31 EDT from fleeb @ Uncensored


Yeah... I did not look deeply into that particular problem, since I was in a hurry.  I just noticed when I went single-threaded for that task, performance increased significantly, so I went with it.  I doubt I really had over 100 threads at the same time (I'm crazy, but I'm not completely stupid), but I might have had something analogous to that happening accidentally.  Again, though, multi-threading is very hard.  I prefer to use only one thread and asynchronous I/O instead.

And, yeah, it doesn't surprise me that boost already has something like that.  Hell, I was using boost::thread in the first place.



[#] Tue May 31 2011 07:54:45 EDT from fleeb @ Uncensored

Subject: Re:


 

Mon May 30 2011 19:31:19 EDT from LoanShark @ Uncensored
thought I'd check on the code's performance.  I had to resort to a
very high resolution timer, since I wasn't working with a huge amount
of data.  I was pleased with the results... it seemed very fast

when microbencharking... if the code is too small to measure in milliseconds, you may want to just run it through a loop about 10K times...

Duh, I should have thought to do that.  Still, this gives a relatively decent idea of how fast it works without so much churn.

Heh... 'microbench-arking'...



[#] Tue May 31 2011 10:57:54 EDT from IGnatius T Foobar @ Uncensored

Subject: Re:


The original question asked was something to the effect of, if you have one function which produces data and another which consumes it, which one should be the caller and which should be the callee?

It would seem that such a program would be better served by putting *neither* function in charge, and instead have the main loop call each of them in turn, passing buffers of data back and forth as appropriate.
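A minimal sketch of that arrangement in C++, where neither side calls the other and a driver loop owns the buffer. `Producer`, `Consumer`, `pump`, and the 8-byte buffer are all hypothetical names and sizes chosen for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical producer: hands out a fixed string a few bytes at a time.
struct Producer {
    std::string data;
    std::size_t pos;
    explicit Producer(const std::string& s) : data(s), pos(0) {}

    // Copies up to `cap` bytes into `buf`; returns 0 once exhausted.
    std::size_t produce(char* buf, std::size_t cap) {
        const std::size_t n = std::min(cap, data.size() - pos);
        std::memcpy(buf, data.data() + pos, n);
        pos += n;
        return n;
    }
};

// Hypothetical consumer: counts space-separated words. It has to carry
// `in_word` between calls because a word may straddle two buffers.
struct Consumer {
    std::size_t words;
    bool in_word;
    Consumer() : words(0), in_word(false) {}

    void consume(const char* buf, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            const bool is_space = (buf[i] == ' ');
            if (!is_space && !in_word) ++words;
            in_word = !is_space;
        }
    }
};

// The main loop: call each side in turn, passing the buffer between them.
std::size_t pump(Producer& p, Consumer& c) {
    char buf[8];
    std::size_t n;
    while ((n = p.produce(buf, sizeof buf)) > 0) {
        c.consume(buf, n);
    }
    return c.words;
}
```

Note that both sides had to turn their "where was I?" into explicit member variables (`pos`, `in_word`). That bookkeeping is the cost of this approach once the two functions get more complicated.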

Why is this approach not discussed? (And no, I wasn't really serious about going multithreaded, though it poses some interesting things to think about.)

[#] Tue May 31 2011 11:45:26 EDT from fleeb @ Uncensored


Er, actually, that is what he's doing (and the problem he is trying to solve).

He's just kind of involved with some of the semantics of it, at least in terms of making the program look more legible (which is also a concern).  Because code should be maintainable, and relatively easy to read.  I suppose, in the end, he's found this lovely gray area where 'easy to read' means two different things.  It could be easy-to-read as in 'follow all the C-code instructions to understand how it works' or easy-to-read as in 'follow the general logic to understand how it works without getting caught up in some of the ignorable details'.

In the first pair of code snippets, he shows the 'original' code.  It's nice and legible, but heavy and not portable.

So, he showed a conventional way to rewrite the two functions so that they do just what you suggested.  The problem is, the code isn't quite as legible as the original code... in fact, it's downright ugly.

That led to the approach he favors, which involves using coroutines.  He steps you through the whys behind the resulting code, eventually getting to the relatively nice looking code that works for him.

Then, he points out what a dick your boss would be for not accepting this kind of code because he prefers the 'follow all the c-code instruction' perspective instead of the 'follow the basic logic' perspective, which is kinda cute.



[#] Tue May 31 2011 11:52:16 EDT from fleeb @ Uncensored


On a semi-related note, I found Tom Duff's comment on Duff's Device amusing:

http://www.lysator.liu.se/c/duffs-device.html



[#] Tue May 31 2011 13:03:16 EDT from fleeb @ Uncensored


I think I found one other thing slowing down the original implementation that I downloaded.

Originally, the author set the macro BUFFERSIZE to some large value, ostensibly because he's expecting you to run files through it, and I guess he figured they might be large files.

My equivalent setting is only 4096, so I changed BUFFERSIZE to 4096 and recompiled.  It seemed to speed things up for the original implementation, to the point where it runs faster than my implementation for decoding (although my implementation using strings instead of iostreams is still much faster).  This is more in line with my expectations.  My code is still faster at encoding, though, but not by much.



[#] Tue May 31 2011 16:49:18 EDT from LoanShark @ Uncensored



not really in boost. looks like the way to go for C++ developers who need fine-grained parallelism is http://threadingbuildingblocks.org/

[#] Tue May 31 2011 17:22:10 EDT from fleeb @ Uncensored


Hmm... that's intriguing.  I should give that a closer looksee.



[#] Wed Jun 01 2011 04:53:31 EDT from dothebart @ Uncensored

Subject: Re:


*gngn* where was the original discussion at? (shouldn't it belong into programming in first place?)

fleeb, I get this:

gcc base_xx.cpp -o test1
In file included from /usr/include/c++/4.5/unordered_map:35:0,
from base_64.h:18,
from base_xx.cpp:5:
/usr/include/c++/4.5/bits/c++0x_warning.h:31:2: error: #error This file requires compiler and library support for the upcoming ISO C++ standard, C++0x. This support is currently experimental, and must be enabled with the -std=c++0x or -std=gnu++0x compiler options.
In file included from base_xx.cpp:5:0:
base_64.h:285:12: error: 'unordered_map' in namespace 'std' does not name a type
base_64.h:476:4: error: 'char_lookup_t' does not name a type
base_64.h: In member function 'int tvr::decode::base_64::decode_value(char)':
base_64.h:451:10: error: '_alphabet_map' was not declared in this scope
base_64.h:455:5: error: 'char_lookup_t' has not been declared
base_64.h:455:35: error: expected ';' before 'found'
base_64.h:456:10: error: 'found' was not declared in this scope
base_64.h:456:19: error: '_alphabet_map' was not declared in this scope
base_64.h: In member function 'void tvr::decode::base_64::build_map()':
base_64.h:470:6: error: '_alphabet_map' was not declared in this scope

 

and the other one... no main?

gcc cdecode.c -o test
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 0 has invalid symbol index 11
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 1 has invalid symbol index 12
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 2 has invalid symbol index 2
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 3 has invalid symbol index 2
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 4 has invalid symbol index 11
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 5 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 6 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 7 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 8 has invalid symbol index 2
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 9 has invalid symbol index 2
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 10 has invalid symbol index 12
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 11 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 12 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 13 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 14 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 15 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 16 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 17 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 18 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 19 has invalid symbol index 13
/usr/bin/ld.bfd.real: /usr/lib/debug/usr/lib/crt1.o(.debug_info): relocation 20 has invalid symbol index 20
/usr/lib/gcc/x86_64-linux-gnu/4.5.3/../../../../lib/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'



[#] Wed Jun 01 2011 07:34:10 EDT from fleeb @ Uncensored

Subject: Re:


 

Wed Jun 01 2011 04:53:31 EDT from dothebart @ Uncensored Subject: Re:

*gngn* where was the original discussion at? (shouldn't it belong into programming in first place?)

I got very, very angry in programming a while ago, and zForgot it.  This was the next best place.

fleeb, I get this:

gcc base_xx.cpp -o test1
In file included from /usr/include/c++/4.5/unordered_map:35:0,
from base_64.h:18,
from base_xx.cpp:5:
/usr/include/c++/4.5/bits/c++0x_warning.h:31:2: error: #error This file requires compiler and library support for the upcoming ISO C++ standard, C++0x. This support is currently experimental, and must be enabled with the -std=c++0x or -std=gnu++0x compiler options.

Yeah, I pointed out earlier that you might have a problem with my use of 'unordered_map'.  It is the one thing that was potentially difficult here, but I needed to use unordered_map for the hash table (faster than a binary tree).  I could maybe create another kind of lookup table that doesn't require this, but it would potentially waste space instead.  If you have boost, change #include to <boost/unordered_map.hpp> (or whatever it is) and it should be just fine.  If you have tr1, use that instead.  Otherwise, try using -std=c++0x as the compiler suggests.
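For what it's worth, a sketch of what the "other kind of lookup table" could look like: a plain 256-entry array indexed by the character, which needs no C++0x, tr1, or boost. `DecodeTable` and its layout are assumptions for illustration and not what base_64.h actually does:

```cpp
#include <cstddef>
#include <string>

// Hypothetical replacement for the hash-based lookup: one slot per possible
// byte value, holding the character's index in the alphabet or -1.
struct DecodeTable {
    signed char value[256];

    explicit DecodeTable(const std::string& alphabet) {
        for (std::size_t i = 0; i < 256; ++i) {
            value[i] = -1;  // not part of the alphabet
        }
        for (std::size_t i = 0; i < alphabet.size(); ++i) {
            value[static_cast<unsigned char>(alphabet[i])] =
                static_cast<signed char>(i);
        }
    }

    // Returns the 0-based position of `c` in the alphabet, or -1.
    int decode_value(char c) const {
        return value[static_cast<unsigned char>(c)];
    }
};
```

The wasted space here is 256 bytes per alphabet, most of them holding -1, in exchange for a lookup that is a single array index.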

and the other one... no main?

I thought I got that... there's supposed to be a main in base_xx.cpp.  The DevStudio IDE loves to use a version of main that provides for TCHARs instead of chars.  Just change whatever the IDE created to a normal main and it should be fine (I'm not using any command line arguments anyway).

 



[#] Wed Jun 01 2011 12:41:06 EDT from Ford II @ Uncensored


Maybe your code is small enough that it remains inside the data and
code buffers of the cpu, too?

I was thinking something like this. Perhaps more spinning variables end up in registers your way.

[#] Wed Jun 01 2011 13:06:27 EDT from fleeb @ Uncensored


Well, the updated archive (with base32 and base16, not to mention some bugfixes) is on http://www.fleeb.com/base_xx if you're interested.  It corrects the weird TCHAR main thing, but I'm still using unordered_map (at least until I can find something else that's fast but more available).



[#] Wed Jun 01 2011 15:35:00 EDT from Ford II @ Uncensored


On a semi-related note, I found Tom Duff's comment on Duff's Device
amusing:

The only problem with duff's device is that it really isn't all that useful. It's very cute, but that's about it.

[#] Wed Jun 01 2011 15:38:52 EDT from fleeb @ Uncensored


Well... it is useful in a very, very specific way.  You can implement coroutines in C/C++ with them.  So the problems solved best by co-routines are best solved with Duff's Device.
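Here is a stripped-down sketch of the trick, in the spirit of the coroutines article linked earlier but not copied from it. The `Counter` struct and its member names are made up for illustration. The `case 1:` label sits inside the `for` loop, so the `switch` jumps straight back into the middle of the loop on the next call, the same kind of case-label abuse Duff's Device relies on:

```cpp
// Hypothetical generator: successive calls to next() return 0, 1, ..., n-1,
// and then -1 on every call after that.
struct Counter {
    int state;  // which resume point to jump to on the next call
    int i;      // loop variable; a member so it survives between calls
    int n;

    explicit Counter(int limit) : state(0), i(0), n(limit) {}

    int next() {
        switch (state) {
        case 0:  // first call: start the loop normally
            for (i = 0; i < n; ++i) {
                state = 1;
                return i;  // "yield" the current value
        case 1:;           // later calls resume here, inside the loop body
            }
        }
        state = 2;  // no case matches 2, so later calls skip the switch
        return -1;
    }
};
```

Anything that must live across a "yield" has to be a member (or static) rather than a local, since the locals are gone once the function returns.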


