Like I said, the 4000 series has 4 sets of pipelines, so the OS sees it as a 4 core. The two cores in a module just share some of the front end and the L2 cache. That's a lot different from Intel's HT, which runs 2 threads on the same pipelines. The OS just sees a 4 core, not 2 dual-core modules.
It's better with the patch, but still not quite right. Say you're running 2 threads. If they don't share any cached data, it's better to run each thread on a core in a different module. If they do share cached data, it's better to run both threads on the same module, since the two cores in a module share the L2 cache. In other words, Windows is having a hard time working out which thread to run on which core.
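To make the 2-thread case concrete, here is a rough Python sketch of that rule of thumb. It is only a toy model, not how the Windows scheduler actually works. The core numbering (cores 0 and 1 in module 0, cores 2 and 3 in module 1, and so on) and the names `module_layout` and `place_two_threads` are made up for illustration.

```python
def module_layout(n_modules):
    """Hypothetical layout: module m holds cores 2m and 2m+1 (assumed numbering)."""
    return [(2 * m, 2 * m + 1) for m in range(n_modules)]


def place_two_threads(modules, share_data):
    """Pick two cores for two threads using the rule of thumb above.

    share_data=True  -> both cores of one module, so the threads share its L2.
    share_data=False -> one core in each of two different modules, so each
                        thread gets a module's shared resources to itself.
    """
    if share_data or len(modules) < 2:
        return modules[0]
    return (modules[0][0], modules[1][0])


fx4000 = module_layout(2)  # 2 modules, 4 cores
print(place_two_threads(fx4000, share_data=True))
print(place_two_threads(fx4000, share_data=False))
```

The catch is that the OS usually has no idea whether two threads share data, which is why it can't always pick the right case.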
The 6000 and 8000 series will run 4-threaded games and so on (slightly) better than the 4000 series at the same GHz. For the reason above, the 6000 and 8000 can spread the threads out across more modules/cores, unlike the 4000, which has to bunch them all up on 4 cores/2 modules.
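A quick way to see the difference is to count how many modules end up with both cores busy when 4 threads are spread out as evenly as possible. This is just a toy count, assuming 2, 3 and 4 modules for the 4000, 6000 and 8000 series and an ideal scheduler. The function names are made up.

```python
def spread_threads(n_modules, n_threads):
    """Deal threads out round-robin, one module at a time (idealized scheduler)."""
    if n_threads > 2 * n_modules:
        raise ValueError("more threads than cores")
    load = [0] * n_modules
    for t in range(n_threads):
        load[t % n_modules] += 1
    return load


def doubled_up_modules(load):
    """Count modules where both cores are busy and so share the module's resources."""
    return sum(1 for threads in load if threads >= 2)


# Assumed module counts per series: 4000 -> 2, 6000 -> 3, 8000 -> 4.
for name, n_modules in [("4000", 2), ("6000", 3), ("8000", 4)]:
    load = spread_threads(n_modules, 4)
    print(name, load, doubled_up_modules(load))
```

With 4 threads the 4000 has both of its modules doubled up, the 6000 has one, and the 8000 has none.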