[Flent-users] [tohojo/flent] packet loss stats (#106)

Dave Täht notifications at github.com
Sat Nov 11 20:36:02 EET 2017


Pete:

I don't know enough about OSX to say why your packet jitter is so bad; I
will ask a few Apple folks what could be done to make Go's timers
better there.

Pete Heist <notifications at github.com> writes:

> Nice, so this namespace stuff seems to work. :) Comments on the numbers:
>
> * I wish github would allow markdown in email; meanwhile you may be able to edit
>   the post and add triple backticks. I pasted into vim for viewing. 
> * It might be interesting to see your first result (localhost) on the same
>   hardware without the namespaces stuff, to compare, just a straight
>   irtt client -i 1ms -d 10s -q localhost. 

I'll do one and paste that to github when I get back.

> * Your max server processing time in two of the cases was around 800us, which is
>   pretty high compared to the min of 2us. I'd like to understand this; I'm
>   looking into Go scheduling and its possible effect on those
>   maximums. Perhaps I can use the mentioned "scheduler tracing" to figure
>   something out. 
> * Two of the max send call times were also ~2ms, which is pretty high. 
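
The "scheduler tracing" you mention is, I assume, Go's GODEBUG
facility; a rough example (the 1000 is milliseconds between summary
lines):

    GODEBUG=schedtrace=1000 irtt client -i 1ms -d 10s -q localhost

That prints a one-line scheduler summary (thread counts, run queue
lengths) to stderr every second, which may show whether the maxima line
up with scheduler activity.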

This was not an idle box but a desktop doing other stuff. I imagine that
the max timeslice handed out to other processes can be quite high,
especially given that modern browsers fire off so many internal threads.

I am running with CONFIG_HZ_1000 in the kernel.

We can try the chrt command to apply the realtime SCHED_RR policy at
invocation.
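
A hedged example (the priority of 10 is arbitrary; SCHED_RR needs root
or CAP_SYS_NICE):

    sudo chrt --rr 10 irtt client -i 1ms -d 10s -q localhost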

> * For me, I'm thinking of adding to the "Max" column the seqno that the max
>   value came from. It may often be 1, as things "warm up".

ARP can hurt. Toke, at least, used to apply a tiny amount of filtering
to remove startup noise. More academic types tend to throw out
everything above the 98th percentile. Out of a dataset of 1,000+
points, I don't mind throwing out a very early or very late anomalous
result, but the 98th percentile is way too much.
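
A toy Go sketch of the difference (not irtt code; names and numbers are
made up for illustration):

    package main

    import (
            "fmt"
            "sort"
    )

    // trimAbovePercentile sorts a copy of the samples and keeps only
    // those at or below the pth percentile.
    func trimAbovePercentile(samples []float64, p float64) []float64 {
            s := append([]float64(nil), samples...)
            sort.Float64s(s)
            return s[:int(float64(len(s))*p/100.0)]
    }

    func main() {
            // 1000 well-behaved RTTs (µs) plus one startup/ARP spike.
            rtts := []float64{900} // first-packet ARP outlier
            for i := 0; i < 1000; i++ {
                    rtts = append(rtts, 100+float64(i%10))
            }
            fmt.Println("drop first sample:   ", len(rtts[1:]), "kept")
            fmt.Println("98th percentile trim:",
                    len(trimAbovePercentile(rtts, 98)), "kept")
    }

The percentile cut discards ~20 points here where only one was actually
anomalous.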

There might be ways to pull memory allocation and paging out of the
noise if you preallocate enough memory to hold everything and use
something like mlock() to make sure it all stays locked in memory.
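
A hedged Go sketch of that idea (Linux-only; mlockall needs
CAP_IPC_LOCK or a generous RLIMIT_MEMLOCK, and golang.org/x/sys/unix is
assumed; the buffer size is an arbitrary example):

    package main

    import (
            "fmt"
            "runtime/debug"

            "golang.org/x/sys/unix"
    )

    func main() {
            // Preallocate the working set up front so the hot path
            // allocates nothing during the measurement.
            results := make([]int64, 0, 1<<20)
            _ = results

            // Optionally park the GC for the measurement window.
            debug.SetGCPercent(-1)

            // Pin current and future pages into RAM.
            if err := unix.Mlockall(unix.MCL_CURRENT | unix.MCL_FUTURE); err != nil {
                    fmt.Println("mlockall failed:", err)
            }
    }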

An strace might be revealing as to which syscalls you are making and
could try to optimize out, and sar can show the context switches; for
example:
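
    strace -c -f irtt client -i 1ms -d 10s -q localhost
    sar -w 1

(strace -c prints a per-syscall count/time summary and -f follows the
runtime's threads; sar -w reports context switches per second at 1s
intervals.)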

(I am away from my desk and can do these things too when I get back to
it, but I enjoy teaching folks to fish and then eating the results. :))

> Results can change
>   (maxes can go down) by just running the client a second time without
>   restarting the server. 
> * In your second result "15mbit/1mbit emulated dsl link with atm framing with
>   10ms delay each way", is the egress and ingress reversed? I expect you'd want
>   1mbit up, 15mbit down, unless you want to simulate hosting a server with ADSL.
>   Just making sure that the higher delay is happening on the slower direction of
>   the link. 

I wish we did not have this fundamental semantic confusion between up
and down. Bucky Fuller tried to transmogrify our thinking to in and
out, but that only works when gravity is in play. X11 gets it weird, too.

Yes, I think I reversed things on these tests from your perspective.

> * Regarding packet sizes, those can be reduced in a couple of ways, if desired: 
>
>   * -rs count cuts per-packet loss results, which is harmless for these tests,
>     and in fact I may want it to be the default (cuts 8 bytes from packet) 
>   * -clock wall cuts monotonic timestamps, which if you're running on the same
>     box with a stable clock might be ok (cuts 16 bytes from packet), or might
>     not. See below for more info. 
>
> * I went to test the irtt server with -goroutines > 1 to see how that affects
>   things. Somehow occasional duplicates and corrupt packets then appear, as
>   if more than one goroutine gets the same packet from a read call, which should
>   be impossible. I'll track that down, but the default of 1 goroutine per
>   listener on the server works fine.

k.

>
> Regarding wall vs monotonic clocks, there is a clock command in irtt to test the
> difference between wall and monotonic (along with bench and sleep commands,
> which might be interesting). There are significant differences between OS/X El
> Capitan and Debian 9.2 (4.9.0-4) in wall vs monotonic clock behavior:
>
> irtt clock El Capitan:
>
>          Monotonic              Wall   Wall-Monotonic   Wall Drift / Second	
>              150ns             150ns               0s                    0s
>       1.000319918s      1.000288918s            -31µs              -30.99µs
>       2.000515928s      2.000484928s            -31µs             -15.496µs
>       3.002998444s      3.002935444s            -63µs             -20.979µs
>        4.00815177s       4.00808877s            -63µs             -15.717µs
>       5.013422057s      5.013327057s            -95µs             -18.949µs
>       6.016402531s      6.016307531s            -95µs              -15.79µs
>       7.016763433s      7.016636433s           -127µs             -18.099µs
>       8.017018334s      8.016859334s           -159µs             -19.832µs
>        9.01720692s       9.01704792s           -159µs             -17.632µs
>
> irtt clock Debian 9.2:
>
>          Monotonic              Wall   Wall-Monotonic   Wall Drift / Second	
>            1.732µs           1.743µs             11ns            6.351039ms
>       1.000309146s      1.000309488s            342ns                 341ns
>       2.000719155s      2.000719447s            292ns                 145ns
>       3.000926549s      3.000926842s            293ns                  97ns
>       4.001145273s      4.001145607s            334ns                  83ns
>       5.001344771s       5.00134505s            279ns                  55ns
>       6.001557317s      6.001557649s            332ns                  55ns
>         7.0017664s      7.001766734s            334ns                  47ns
>       8.001980031s      8.001980322s            291ns                  36ns
>       9.002189828s      9.002190088s            260ns                  28ns

On glibc Linux, at least, the clock_gettime and gettimeofday calls do
not involve a context switch, as time is kept in a globally shared
page, read via the vDSO and constantly updated by the kernel. musl
doesn't do that (at least not when I last checked), and I suspect,
judging from your ns vs µs results, that El Capitan doesn't either.
There was a wonderful web page that measured these differences in
behavior once, but I can't find it now.
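
A quick way to see which camp a box falls into is to time the calls
from Go itself; a rough sketch:

    package main

    import (
            "fmt"
            "time"
    )

    func main() {
            const n = 1000000
            start := time.Now()
            for i := 0; i < n; i++ {
                    _ = time.Now()
            }
            fmt.Printf("avg time.Now(): %v\n", time.Since(start)/n)
    }

Tens of ns per call suggests a vDSO-style fast path; hundreds of ns or
more suggests a real syscall on every clock read.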

OSX may well make a harder (more precise) timer facility available than
the one Go uses, since it is so often used for realtime work.


-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/tohojo/flent/issues/106#issuecomment-343685082

