tf2 stuttering..
posted in Q/A Help
1
#1
0 Frags +

This thing is really annoying me...
I don't understand why this game always gives me lag or stuttering..
GTA5 requires more performance than tf2...
my pc can run GTA5 with no lag and stuttering... it always gives me nearly or over 60 fps
and my pc spec is:
i5 6500, 8 GB RAM, GTX 1060

2
#2
5 Frags +

TF2 just has super shitty optimization. What dxlevel are you using?

3
#3
3 Frags +

Try out cfg.tf, there are dozens of configs you can try out.

4
#4
5 Frags +
[quote=majh0]GTA5 requires more performance than tf2...[/quote]

Not quite.
GTA5 requires a good CPU with multiple cores and a good GPU.
TF2 just needs a GPU from this millennium and a single CPU core that's 4 times faster than what exists to get stable high fps.

5
#5
3 Frags +

@setsul does tf2 fps even scale with instructions per second (or however you choose to measure absolute processor speed)? it seems to be bound by something else past a certain point, like memory/cache speed or something

6
#6
0 Frags +

@tripzxD I am using dx 9...

7
#7
0 Frags +
[quote=Setsul][quote=majh0]GTA5 requires more performance than tf2...[/quote]
Not quite.
GTA5 requires a good CPU with multiple cores and a good GPU.
TF2 just needs a GPU from this millennium and a single CPU core that's 4 times faster than what exists to get stable high fps.[/quote]

thanks for the information :D

8
#8
0 Frags +

the game gives me about 100~230 fps, but if there is a lot of spam like pipes, stickies and rockets, the fps goes down to 90 or even 50...

is it a usual thing in this game...? D;

9
#9
11 Frags +

#5
There's no absolute measure for CPU speed because out of order execution is rather complicated.
Memory latency is always a problem, that's why caches exist.
TF2 does really weird things. For example, the function that takes up by far the most time, 4 times as much as the next one, isn't bound by memory or width at all. It's slow because for some godforsaken reason it is riddled with instructions that need microcode. No one has used many of these in the last 20 years because everyone knows they are slow. Yet that function is full of them and they don't appear to be aligned. The compiler shouldn't do that. So there are some really weird dependency chains in there that require some very specific and very complicated instructions.
Quick explanation on how modern CPUs work:
Out of order execution means that if you want to calculate x = (a + b + c + d) and use instructions that basically do e = (a + b), f = (c + d), x = (e + f), then when a isn't found in the cache and you'd have to wait to get it from memory, the CPU instead does f = (c + d) first, since it knows that these instructions are independent. Of course it gets a bit more complicated than that, but the idea is that instead of only waiting and doing nothing you do something else that's independent of the data you're waiting for. And you're going to wait a lot: a DRAM access takes about 100 cycles, L3 30 cycles, L2 10-15 cycles, and L1 caches are only 32kB to get that 2-4 cycle access time.
Now of course the CPU is only allowed to reorder instructions, not change them. So if your program is e = (a + b), f = (e + c), x = (f + d) then congratulations, it'll run like shit.
That's one of the things that appear to be happening.
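
A toy model of the two orderings above, as a rough sketch and not how a real core actually schedules work. It assumes unlimited execution units, a 1-cycle add, and that a value which misses the cache shows up after the roughly 100 cycles mentioned above. `finish_times` and all the names in it are made up for illustration.

```python
def finish_times(ops, ready, add_latency=1):
    """Return the cycle at which each value becomes available.

    ops:   list of (dest, src1, src2) adds in program order
    ready: dict mapping each input operand to the cycle it is available
    Assumption: an add can start as soon as both of its sources are ready.
    """
    times = dict(ready)
    for dest, src1, src2 in ops:
        times[dest] = max(times[src1], times[src2]) + add_latency
    return times


# e = a + b, f = c + d, x = e + f  (e and f are independent of each other)
tree = [("e", "a", "b"), ("f", "c", "d"), ("x", "e", "f")]
# e = a + b, f = e + c, x = f + d  (every add depends on the previous one)
chain = [("e", "a", "b"), ("f", "e", "c"), ("x", "f", "d")]

cached = {"a": 0, "b": 0, "c": 0, "d": 0}       # everything already in cache
miss_on_a = {"a": 100, "b": 0, "c": 0, "d": 0}  # a has to come from DRAM

print(finish_times(tree, cached)["x"])      # 2
print(finish_times(chain, cached)["x"])     # 3
print(finish_times(tree, miss_on_a)["f"])   # 1: done while still waiting for a
print(finish_times(chain, miss_on_a)["f"])  # 102: stuck behind the miss
```

With only four operands the final result differs by a single cycle, but in the chained version nothing at all can be done during the wait, and the gap grows with the length of the chain.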

The other is microcode. x86 has a lot of "interesting" instructions. For example loading data and adding something to it in the same instruction is possible. It's solved by simply decoding into an internal instruction that splits into two right before the execution, one for the load, one for the add. But there are more complicated ones that do about a million things. For those there's microcode. Instead of hardwiring instructions that generate 30 internal ones (micro Ops / µOps / uops) it's saved like a program. The decoder sees one of those instructions, everything stops, the CPU looks it up in some internal memory and goes through the corresponding uops one by one.
So the standard decoder setup for Intel used to be 4 decoders. Ideally they take 4 instructions per cycle and decode them. There are some restrictions for that though, but let's not think about that.
The problem is what happens with instructions >4 uops that use microcode. Now with microcode you can still get 4 uops per cycle (4 decoders = also 4 uops) so it seems like it would be the same speed at first glance, maybe slightly slower since it might not be a number of uops divisible by 4. But it's either or: either the normal decoders are used or microcode. That means if instruction 1 is "normal", 2 is microcoded, 3 normal, 4 microcoded and so on, then what happens is you get instr 1 in the first cycle, but 2 can't be decoded because it needs microcode and the decoders are in use. So instr 2 only starts decoding in cycle 2. Let's say it generates 5 uops, that means you get 4 in cycle 2 and one in cycle 3. But instr 3 won't start decoding until cycle 4 because again, you can't use both at the same time. Suddenly your max of 4 uops/cycle is actually down to 2 (6 uops in 3 cycles).
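
The arithmetic above can be checked with a few lines that follow the same simplified either/or rule. This is only a sketch of that rule, not of a real Intel front end: it assumes every "normal" instruction is exactly 1 uop, ignores all other decoder restrictions, and `decode_cycles` / `uops_per_cycle` are hypothetical names.

```python
def decode_cycles(instrs, width=4):
    """Cycles needed to decode `instrs` under the simplified either/or model.

    instrs: list of (uops, microcoded) tuples in program order.
    In any one cycle either the normal decoders run (up to `width`
    consecutive normal instructions) or microcode runs (up to `width`
    uops of a single microcoded instruction), never both.
    """
    cycles = 0
    i = 0
    while i < len(instrs):
        uops, microcoded = instrs[i]
        if microcoded:
            cycles += -(-uops // width)  # ceiling division
            i += 1
        else:
            taken = 0
            while i < len(instrs) and not instrs[i][1] and taken < width:
                i += 1
                taken += 1
            cycles += 1
    return cycles


def uops_per_cycle(instrs, width=4):
    total_uops = sum(uops for uops, _ in instrs)
    return total_uops / decode_cycles(instrs, width)


all_normal = [(1, False)] * 8
alternating = [(1, False), (5, True)] * 10  # normal, microcoded, normal, ...

print(uops_per_cycle(all_normal))   # 4.0
print(uops_per_cycle(alternating))  # 2.0
```

Each normal/microcoded pair costs 3 cycles for 6 uops, which is where the 2 uops/cycle comes from.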

Now no game is going to run at 4 instructions per cycle anyway, but combine this with weird dependency chains, everything depending on the previous instruction (see above) then suddenly you wait a lot for no damn reason.

#8
Yes, that is indeed how it works.
Lower the settings and/or get an fps config.

10
#10
7 Frags +

h-h-h-h-have you restarted your computer?

11
#11
0 Frags +

Thanks for the information about cpu... But it's really hard for me... xD
