Wednesday, June 1st, 2011, 00:05 UTC | ||
[00:05:21] | danielk22 (danielk22!~danielk@96.57.9.142) has joined #mythtv | |
[00:20:41] | sunkan (sunkan!~sunkan@alva.zappa.cx) has joined #mythtv | |
[00:20:49] | aloril (aloril!~aloril@dsl-tkubrasgw3-fe7ef900-153.dhcp.inet.fi) has joined #mythtv | |
[00:22:31] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[00:36:58] | pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has quit (Quit: Leaving.) | |
[00:46:12] | gigem (gigem!~david@host103.16.intrusion.com) has joined #mythtv | |
[00:46:12] | gigem (gigem!~david@host103.16.intrusion.com) has quit (Changing host) | |
[00:46:12] | gigem (gigem!~david@mythtv/developer/gigem) has joined #mythtv | |
[00:46:38] | jpabq: | sphery, stuartm, regarding that "Failed to create OpenGL texture." problem, I noticed it only happens with some themes, and not others. No idea why. |
[00:47:21] | davide (davide!~david@host103.16.intrusion.com) has quit (Ping timeout: 240 seconds) | |
[00:49:45] | jpabq: | It happens even if I start mythfrontend with -O ThemePainter=qt |
[00:53:28] | sphery: | weird... got a list of themes? perhaps we can figure out what's different about them. |
[00:55:51] | jpabq: | It is happening on a theme I have been developing as a testbed for MythUI features. Must be something I am doing different/wrong... Just got to figure out what. |
[00:58:18] | GreyFoxx (GreyFoxx!~greg@2001:470:b12f:dead:beef::1) has joined #mythtv | |
[01:06:22] | Chutt (Chutt!~ijr@cpe-24-29-225-191.neo.res.rr.com) has joined #mythtv | |
[01:12:00] | jpabq: | Found it. Looks like I cut&paste <textarea name="alert_text"> from metallurgy at some point. I am guessing that is not supported any more. |
[01:17:38] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection) | |
[01:36:07] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[01:42:11] | Dave123 (Dave123!~dave@cpe-74-74-200-106.rochester.res.rr.com) has joined #mythtv | |
[01:49:42] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection) | |
[02:07:59] | markk: | jpabq: what are you working on with mythui? |
[02:18:57] | Captain_Murdoch: | markk, when testing remote DVD performance, are you playing back an ISO or a directory containing the VOBs, etc.? |
[02:19:54] | markk: | Captain_Murdoch: mostly ISOs |
[02:19:55] | Captain_Murdoch: | I wonder what it would do if we disabled the readahead thread on the BE. that's doable by setting the 3rd arg to the RemoteFile ctor to false. easy enough to test by changing that in fileringbuffer.cpp and mythiowrapper.cpp |
[02:21:30] | markk: | I was thinking about that – any potential issues ? |
[02:22:30] | Captain_Murdoch: | just potential slowdown. I've wondered several times over the years if we should let the OS handle buffering, read-ahead, etc. on the BE instead of trying to do it ourself. if NFS is fast enough, then we should be able to serve up blocks over mythproto at similar speeds. |
[02:25:20] | markk: | Captain_Murdoch: presumably I can test that just by changing the frontend code |
[02:26:20] | Captain_Murdoch: | not sure if it was you or not that I worked with earlier on it or not, but the reason I differentiate and have both raw RemoteFile and RingBuffer support in mythiowrapper.cpp is because it's costly to spin up a RingBuffer, so for small files it's quicker to just spin up a RemoteFile. |
[02:27:05] | markk: | yup – we discussed that a few weeks ago |
[02:27:31] | Captain_Murdoch: | yeah, just need to change those two RemoteFile() cases. force the third arg to false to disable the readahead on the BE. I don't know how much that code has been tested in non-write mode though, so hopefully it still works OK.I'm not sure if we disable it anywhere for reads. |
[02:27:48] | Captain_Murdoch: | s/tested/tested recently/ |
[02:27:58] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[02:33:19] | wagnerrp: | Captain_Murdoch: did you see the patch on trac where someone added frontend socket commands to tell it to flip blocking and not blocking shutdown of the backend? |
[02:33:34] | Captain_Murdoch: | saw the email come through but hadn't looked over the patch. |
[02:33:36] | wagnerrp: | i knew the ALLOW_SHUTDOWN and BLOCK_SHUTDOWN commands had existed in the past |
[02:33:46] | wagnerrp: | but i always thought they were for slave backends |
[02:33:59] | ** Captain_Murdoch can't recall where or if we use those anywhere. ** | |
[03:03:05] | jpabq: | markk, buttonlist stuff. |
[03:13:03] | markk: | jpabq: thanks – just curious:) |
[03:15:13] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection) | |
[03:18:30] | markk: | Captain_Murdoch: turning off readahead thread on backend doesn't seem to help. |
[03:20:14] | taylorr: | markk: one snapshot I took was that it took 250 msec to read ~5MB of data during a matroska cluster read |
[03:20:14] | Captain_Murdoch: | markk, ok. it was a thought... :( |
[03:20:26] | taylorr: | not sure if that's very good performance or not |
[03:24:59] | markk: | Captain_Murdoch: for the dvd iso over remotefile case, I seem to have a workaround. by disabling the local readahead thread but enabling the readahead thread in libdvdnav, I seem to get fast startup and reasonable buffering. still some small buffering pauses – so still not ideal. but progress. |
[03:25:55] | Captain_Murdoch: | makes sense, as that code would know better what it needs to read ahead. |
[03:26:06] | Captain_Murdoch: | s/what/when/ |
[03:33:43] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[03:37:00] | taylorr: | markk: for Kyle Rose's video it took 1.104 secs to read 31MB of data |
[03:38:12] | markk: | taylorr: is that gigabit lan? |
[03:38:30] | taylorr: | so for 1.1.04 secs "zero" packets get sent to the decoder for decoding – if the video is 24 fps and we only have 17 video buffers then clearly we exhaust video buffers waiting for the decoder to produce more |
[03:38:38] | taylorr: | markk: yes it's a gigabit lan |
[03:39:29] | taylorr: | reading 31MB of data in 1.1 secs is over 200 Mb/s |
[03:39:50] | taylorr: | so I believe this is just the nature of how ffmpeg muxes matroska |
[03:40:23] | taylorr: | if I playback matroska that is muxed by another muxing program other than ffmpeg I don't see these large cluster reads |
[03:41:24] | taylorr: | I don't understand why nfs works though |
[03:42:31] | taylorr: | markk: could you run the video when it's playing back and look for the ReadPriv bursts, take the first and last ReadPriv message and see how long it takes and how much data is read |
[03:42:56] | taylorr: | you need -v file,extra |
[03:43:11] | taylorr: | or just post a log on pastebin |
[03:43:25] | taylorr: | would be great to see a failure and success log |
[03:43:35] | taylorr: | I just don't see anything we are doing wrong |
[03:44:48] | taylorr: | markk: a faster machine might handle this better as it can produce the checksum and parse the data much faster |
[03:48:36] | PointyPumper (PointyPumper!~pintlezz@190.244.73.13) has quit () | |
[03:51:28] | markk: | taylorr: ok – just need to clear out some local patches and cleanup. so do you want me to test with the galactica clip (kyle) ? |
[03:52:29] | taylorr: | sure, it seems worse than cars |
[03:54:13] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection) | |
[03:56:11] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[03:56:16] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection) | |
[04:03:05] | taylorr: | markk: interesting, it seems that ffplay doesn't have issues and it only buffers a maximum of 2 frames |
[04:10:16] | taylorr: | markk: I'm looking at matroska_parse_cluster() calls under mythffplay and they return much faster – definitely no where near the 1+ seconds that MythTV experiences |
[04:11:05] | taylorr: | so that points to ReadPriv() performance – and possibly in relation to safe_read() of a remote file |
[04:14:33] | markk: | taylorr: http://pastebin.ca/2073072- I had to trim the start. but it clearly shows a couple of reads that take about half a second – both of which are far slower than any other. not congestion etc as they always appear in the same spot. |
[04:15:46] | markk: | taylorr: search for 'took 480' and 'took 503' |
[04:16:37] | taylorr: | markk: I think the culprit is in the ReadPriv() path somewhere – we just need to look at what might impact performance there. |
[04:18:58] | taylorr: | markk: that is the safe_read taking 480ms – the problem seems to happen before that – look at the buffer status the line before "took 480ms" – it's out of video buffers |
[04:19:15] | taylorr: | the problem happened because the ReadPriv() calls collectively took too long |
[04:23:16] | markk: | taylorr: I'm not sure I'm following |
[04:24:07] | ** markk thinks that could be the general anaesthetic this morning – and the lack of food... ** | |
[04:25:38] | taylorr: | markk: look at the "Waited 106ms for video buffers" – that happened because the ReadPriv() calls took too long – those are called by libavformat/matroskadec.c's matroska_cluster _read_function |
[04:27:11] | taylorr: | basically we aren't copying the data from the frontend's readahead buffer to the matroska demuxer fast enough via the ReadPriv() calls |
[04:28:07] | taylorr: | maybe worth looking at the code in ringbuffer.cpp to see if anything in ReadPriv could cause much of a delay – it does quite a few checks |
[04:28:32] | taylorr: | maybe one of those checks would be slower if using a remote file? |
[04:28:53] | taylorr: | also might be worth looking at the locking |
[04:29:46] | markk: | yeah – locking looks like an obvious candidate |
[04:37:03] | taylorr: | markk: hmmm, maybe it is the safe_read taking too long |
[04:37:27] | taylorr: | does it hold a lock that prevents ReadPriv() from completing? |
[04:38:02] | markk: | taylorr: more than likely. |
[04:39:20] | taylorr: | markk: yes looks like it takes the rbwlock right before it runs the safe_read() in the readahead while loop |
[04:41:44] | taylorr: | nope, I don't know what I'm talking about |
[04:41:59] | taylorr: | I think your log my indicate that the readahead is exhausted |
[04:45:37] | PointyPumper (PointyPumper!~pintlezz@190.244.73.13) has joined #mythtv | |
[04:47:15] | taylorr: | markk: there it is – the rbwlock.lockForRead() right before the safe_read() call causes ReadPriv() to stall since it calls ReadBufAvail which also wants the rbwlock |
[04:48:16] | taylorr: | so even if we have plenty of buffer for ReadPriv() to read a safe_read() would cause us to stall |
[04:49:36] | taylorr: | that leaves a couple areas to improve, 1) why did it take 480 msec for safe_read() to complete and 2) should we be blocking in ReadBufAvail() if a safe_read is in progress |
[04:50:51] | taylorr: | I think safe_read should copy the data into a local buffer first and once it's got the data then get the lock and copy the buffer in the readahead buffer |
[04:52:20] | taylorr: | markk: I wonder if it's safe to remove the rbwlock before the safe_read() gets called in the readahead loop |
[04:52:47] | markk: | taylorr: only one way to find out... |
[04:54:35] | taylorr: | also wonder how hard it would be to change safe_read() to copy to a local buffer and then acquire the lock to move the data to the readahead buffer |
[04:55:00] | taylorr: | that way the lock isn't held up waiting for the backend to respond |
[04:55:24] | taylorr: | having a lock around safe_read() totally defeats the purpose of the readahead buffer :) |
[04:58:21] | markk: | taylorr: I think I agree. though for some reason I can't reproduce here at the momnet... |
[04:59:59] | taylorr: | markk: the huge safe_read() might still be an issue because if the matroska_cluster_read wants 32 MB of data at a time and our buffer is only 4 MB and a safe_read() gets called that takes 480 msec we are still in trouble |
[05:00:45] | taylorr: | so the first thing that needs to be resolved is why it takes 480 msec to get ~1MB of data |
[05:02:36] | taylorr: | wait it got 557KB not even 1MB for the 480 msec stall |
[05:03:32] | taylorr: | that's a whopping 8Mb/s data rate |
[05:08:09] | markk: | taylorr: it really has stopped pausing – can't get it to show the issue anymore... |
[05:08:32] | taylorr: | I've noticed it can be transient |
[05:09:29] | taylorr: | take a look at libmythdb/remotefile.cpp and the Read() function – it increments the socket wait every iteration in the loop upto 200msec |
[05:15:56] | pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has joined #mythtv | |
[05:20:34] | taylorr: | markk: you should commit the extra ",took %1 ms" debug output |
[05:20:41] | taylorr: | that's very useful |
[05:36:24] | taylorr: | markk: so for Cars it's definitely safe_read() blocking ReadPriv() that is causing the playback pauses |
[05:37:24] | taylorr: | I think the first thing that should be done is to not block ReadPriv() while waiting for a network request to complete |
[05:38:36] | taylorr: | and then we need to figure why safe_read() takes an obscene amount of time occasionally |
[05:38:53] | andreax (andreax!~andreaz@p57B92843.dip.t-dialin.net) has joined #mythtv | |
[05:38:54] | taylorr: | wonder if ReadPriv() could block safe_read() :) |
[05:47:39] | andreax (andreax!~andreaz@p57B92843.dip.t-dialin.net) has quit (Read error: Connection reset by peer) | |
[05:50:32] | taylorr: | markk: even removing some of the locks it appears safe_read() is still blocking ReadPriv() – I dunno anymore |
[05:50:37] | taylorr: | time for sleep |
[06:15:58] | martin_ (martin_!~quassel@static-88.131.29.2.addr.tdcsong.se) has joined #mythtv | |
[06:18:01] | Andy50 (Andy50!andy50@173-23-19-191.client.mchsi.com) has joined #mythtv | |
[06:18:24] | Andy5O (Andy5O!andy50@173-23-19-191.client.mchsi.com) has quit (Read error: Connection reset by peer) | |
[06:29:49] | gandalfcome (gandalfcome!~gandalfco@mithrandir.anu.edu.au) has joined #mythtv | |
[06:46:59] | natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has joined #mythtv | |
[06:53:55] | natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has quit (Ping timeout: 252 seconds) | |
[08:16:24] | eharris (eharris!~eharris@99-179-7-82.lightspeed.austtx.sbcglobal.net) has quit (Ping timeout: 255 seconds) | |
[08:17:53] | eharris (eharris!~eharris@99-179-7-82.lightspeed.austtx.sbcglobal.net) has joined #mythtv | |
[08:54:09] | stuartm_ (stuartm_!~stuartm@cpc4-derb9-0-0-cust534.8-3.cable.virginmedia.com) has joined #mythtv | |
[08:54:09] | stuartm_ (stuartm_!~stuartm@cpc4-derb9-0-0-cust534.8-3.cable.virginmedia.com) has quit (Changing host) | |
[08:54:09] | stuartm_ (stuartm_!~stuartm@mythtv/developer/stuartm) has joined #mythtv | |
[08:54:39] | stuartm (stuartm!~stuartm@mythtv/developer/stuartm) has quit (Ping timeout: 255 seconds) | |
[09:40:04] | stuartm_ is now known as stuartm | |
[10:05:03] | Guest20800 (Guest20800!~mike@c-24-21-63-118.hsd1.or.comcast.net) has quit (Remote host closed the connection) | |
[10:05:56] | mike|3 (mike|3!~mike@c-24-21-63-118.hsd1.or.comcast.net) has joined #mythtv | |
[10:14:37] | pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has quit (Quit: Leaving.) | |
[11:47:03] | gigem (gigem!~david@mythtv/developer/gigem) has quit (Remote host closed the connection) | |
[11:47:28] | gigem (gigem!~david@host103.16.intrusion.com) has joined #mythtv | |
[11:47:28] | gigem (gigem!~david@host103.16.intrusion.com) has quit (Changing host) | |
[11:47:28] | gigem (gigem!~david@mythtv/developer/gigem) has joined #mythtv | |
[11:55:24] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[11:55:46] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection) | |
[12:03:25] | stoffel (stoffel!~quassel@p57B4B0A6.dip.t-dialin.net) has joined #mythtv | |
[12:31:34] | aloril (aloril!~aloril@dsl-tkubrasgw3-fe7ef900-153.dhcp.inet.fi) has quit (Ping timeout: 246 seconds) | |
[12:44:58] | aloril (aloril!~aloril@dsl-tkubrasgw3-fe7ef900-153.dhcp.inet.fi) has joined #mythtv | |
[13:00:46] | gandalfcome (gandalfcome!~gandalfco@mithrandir.anu.edu.au) has quit (Ping timeout: 240 seconds) | |
[13:05:15] | danielk22 (danielk22!~danielk@96.57.9.142) has left #mythtv () | |
[13:15:55] | Jordack (Jordack!~jordack@h69-131-44-221.mdsnwi.tisp.static.tds.net) has joined #mythtv | |
[13:28:35] | okolsi (okolsi!~mythtv@unaffiliated/okolsi) has joined #mythtv | |
[13:33:40] | okolsi: | there seems to be something wrong with the sockets during UPnP M-SEARCH response sending (IPv6 stuff?): http://pastebin.com/HiJJt045 |
[13:42:49] | stuarta: | where do you get the idea that is ipv6 related? |
[13:56:09] | digitlman (digitlman!463e6fd2@gateway/web/freenode/ip.70.62.111.210) has joined #mythtv | |
[13:56:15] | digitlman (digitlman!463e6fd2@gateway/web/freenode/ip.70.62.111.210) has left #mythtv () | |
[14:00:32] | okolsi: | stuarta: this used to work and there has been those ipv6 changes that broke stuff elsewhere.. just a guess, could be anything :) |
[14:00:47] | stuarta: | i blame the hampster |
[14:00:51] | stuarta: | :) |
[14:04:55] | markk: | okolsi: ssdp responses have been broken since the ipv6 changes. I tried to ping GreyFoxx a week or so ago without any luck |
[14:05:03] | kormoc (kormoc!~kormoc@mythtv/developer/kormoc) has quit (Read error: Operation timed out) | |
[14:05:50] | sailerboy (sailerboy!~sailerboy@ipv61.sailerboy.net) has quit (Ping timeout: 260 seconds) | |
[14:07:44] | kormoc (kormoc!~kormoc@mythtv/developer/kormoc) has joined #mythtv | |
[14:08:01] | GreyFoxx: | Oh? I didn't see your message |
[14:08:44] | GreyFoxx: | tell me how I can test it and I'm sure I can fix it easily enough (other than myth boxes I have no upnp capable stuff right now) |
[14:08:51] | sailerboy (sailerboy!~sailerboy@ipv61.sailerboy.net) has joined #mythtv | |
[14:08:58] | GreyFoxx: | Orwill backend discovery be enough ? |
[14:09:19] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[14:09:55] | martin_ (martin_!~quassel@static-88.131.29.2.addr.tdcsong.se) has quit (Remote host closed the connection) | |
[14:17:47] | markk: | GreyFoxx: the UPnP/SSDP code isn't receiving/responding to search requests coming in from other devices (fairly sure it's the receive bit that is broken – as there is nothing in the upnp logs when the search requests go out) |
[14:19:01] | okolsi: | markk, GreyFoxx: I did some basic debugging and good looking data is written to MSocketDevice with good looking IP and port etc., but then like those logs show, the socket is somehow invalid |
[14:19:24] | GreyFoxx: | Oko: What are you using to generate the errors? |
[14:19:28] | okolsi: | and nothing gets send back to the other device (verified with Wireshark) |
[14:19:44] | okolsi: | GreyFoxx: mobile phone with two different apps |
[14:19:48] | GreyFoxx: | ahhh |
[14:19:58] | GreyFoxx: | Android apps? |
[14:20:02] | GreyFoxx: | If so I can test that |
[14:20:22] | okolsi: | N900, it's own media player and Zhaan which is a UPnP control point app |
[14:20:31] | GreyFoxx: | ahhh ok |
[14:20:38] | GreyFoxx: | I bet djmount would do it |
[14:20:53] | GreyFoxx: | once my compile of a newer QT is finished up I'll try it out |
[14:21:21] | taylorr: | markk: ok, I put some more debug in and the ReadPriv() calls stop because the readahead buffer is completely drained (safe_read() doesn't block ReadPriv()) |
[14:21:44] | taylorr: | but safe_read() taking so long allows the buffer to drain |
[14:22:03] | taylorr: | I just had a read of 900KB of data take 720 msec |
[14:22:29] | iamlindoro: | youch |
[14:22:32] | taylorr: | to be clear, that read is the REQUEST_BLOCK to the backend initiated by safe_read |
[14:22:48] | taylorr: | so the problem is definitely on the backend |
[14:23:57] | sailerboy (sailerboy!~sailerboy@ipv61.sailerboy.net) has quit (Read error: Operation timed out) | |
[14:24:37] | markk: | taylorr: I was just about to write you a detailed mail. yes – your clip doesn't ask for data for 16 seconds, then tries to read about 5Mb very quickly (i.e. more than the buffer). So the ringbuffer waits on data from the readahead. If I have a decent gigabit ethernet connection, it returns data at about 750Mbps – and manages to read fast enough to get past the problem. |
[14:25:24] | markk: | If I drop down to wifi, reads come in at about 15Mbps which isnt't fast enough, and we get a pause. |
[14:25:49] | taylorr: | markk: I've got a gigabit LAN and the safe_reads() are still taking too long |
[14:25:59] | taylorr: | so it's not just the link speed at fault |
[14:26:38] | taylorr: | I'll run iperf between my machines and make sure the link is good |
[14:26:59] | markk: | If I force it to use StreamingRinbuffer with readahead thread – the reads are instantaneous over wifi – so the ffmpeg avio code is buffering more than us:) same probably applies for nfs. |
[14:27:58] | markk: | taylorr: I just added a quick and dirty Mbps output to the safe_read debug output (i.e. work back from bytes read and time to do so) (and forced that to output all of the time) |
[14:28:36] | markk: | oh – and another thing. My ion refuses to connect at gigabit speeds – which I only realised recently. |
[14:28:42] | taylorr: | markk: feel free to check in debug like that, I find without it the normal messages are not sufficient |
[14:28:51] | stoffel (stoffel!~quassel@p57B4B0A6.dip.t-dialin.net) has quit (Read error: Operation timed out) | |
[14:29:02] | sailerboy (sailerboy!~sailerboy@ipv61.sailerboy.net) has joined #mythtv | |
[14:29:17] | taylorr: | I just ran some iperf tests and I get 620 Mbits/sec between the host and client machines |
[14:30:47] | markk: | taylorr: I've not checked the other clip in detail yet, but I think the amount asked for is so high that it actually forces the backend readahead thread to read as well (and reading from disk is going to be much slower than reading from memory) |
[14:31:34] | taylorr: | markk: yes that one is even more bursty – I saw it read ~32MB in one cluster read |
[14:33:26] | markk: | so todays theory, now that the drugs have warn off, is that we really do need more buffering, but I think we should add multipliers based on known facts. i.e. this is a remote file, the connection speed is slow, it's a matroska file with unknown bitrate, it's h264 variable bitrate... |
[14:36:52] | taylorr: | markk: whoa! I just looked at some backend debug and the ReadPriv() is using ReadDirect and not the readahead thread |
[14:38:13] | GreyFoxx: | Hrm, so I see nothing logged by the BE at all when a M-SEARCH is broadcast out, but I do see it logging when a machine sends a NOTIFY , all of which are using 2399.255.255.250:1900 which implies that the BE is getting the stuff but not reacting to it |
[14:40:40] | j-rod|afk is now known as j-rod | |
[14:40:41] | taylorr: | markk: here's my theory – the backend isn't using the readahead buffer and the reads are far enough spaced that maybe the drive is idling or something |
[14:42:49] | sailerboy (sailerboy!~sailerboy@ipv61.sailerboy.net) has quit (Ping timeout: 260 seconds) | |
[14:47:30] | taylorr: | Captain_Murdoch: it appears that the backend isn't using the readahead buffer when streaming to a frontend – any ideas? |
[14:47:55] | taylorr: | it's performing a ReadDirect() everytime the frontend asks for data |
[14:48:57] | sailerboy (sailerboy!~sailerboy@ipv61.sailerboy.net) has joined #mythtv | |
[14:49:22] | wagnerrp: | GreyFoxx: you see the comment on the -users ML where a linux user is not listening on IPv4? |
[14:49:50] | GreyFoxx: | No, I don't read much of -users anymore. What is the subject/user ? |
[14:50:21] | okolsi: | GreyFoxx: add "socket,extra" options to backend logging, should see those invalid socket messages then |
[14:50:40] | wagnerrp: | http://www.mythtv.org/pipermail/mythtv-users/ . . . /316137.html |
[14:54:11] | GreyFoxx: | wag: Thanks |
[14:55:50] | GreyFoxx: | I can't imagine what would possibly cause that :) |
[14:56:11] | wagnerrp: | (well im getting it under freebsd) |
[14:56:32] | wagnerrp: | but beirdo said that was a known issue with how freebsd handles network connectivity |
[14:56:42] | markk: | taylorr: that personal spam makes sure the backend readahead thread starts |
[14:56:42] | GreyFoxx: | I'll fire up a virtual freebsd and get it working on that |
[14:57:52] | wagnerrp: | i imagine the few of us actually using freebsd can work around it on our own |
[14:57:59] | wagnerrp: | im more worried about people using osx |
[15:02:39] | abqjp (abqjp!~abqjp@97-119-171-42.albq.qwest.net) has joined #mythtv | |
[15:03:17] | GreyFoxx: | Yeah |
[15:03:30] | GreyFoxx: | unfortuntely my macmini has an old OSX |
[15:04:26] | GreyFoxx: | oko: Ok, I can reproduce it |
[15:04:41] | GreyFoxx: | once my QT compile finishes I'll be able to poke around |
[15:05:19] | mag0o: | fyi, virtualbox now has nice osx support, with just a simple edit to the xml for the vm, assuming that you have a valid osx installation media |
[15:05:28] | GreyFoxx: | oh? |
[15:05:32] | GreyFoxx: | What is the edit? |
[15:05:39] | mag0o: | one sec, let me find the link |
[15:06:15] | mag0o: | http://www.sysprobs.com/install-mac-os-1065-s . . . issues-fixed |
[15:06:24] | mag0o: | step 4 mainly |
[15:08:53] | markk: | taylorr: presumably if the readahead is working on the backend, then it's reading ahead of the frontend readahead :) |
[15:09:34] | taylorr: | markk: gotta run to lunch, I'll try your patch when I return |
[15:11:09] | pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has joined #mythtv | |
[15:11:26] | Goga777 (Goga777!~Goga777@shpd-95-53-212-70.vologda.ru) has joined #mythtv | |
[15:12:04] | markk: | taylorr: and I'll be back in the morning |
[15:17:39] | GreyFoxx: | mago: Thanks |
[15:49:01] | ** GreyFoxx tries to chear on his QT build so he can test some of his fixes ** | |
[15:49:07] | GreyFoxx: | cheer even |
[15:56:51] | natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has joined #mythtv | |
[16:25:40] | cocoa117 (cocoa117!~cocoa117@wk-28-147.guest.rdg.ac.uk) has joined #mythtv | |
[16:41:10] | SteveGoodey (SteveGoodey!~steve@host86-160-41-16.range86-160.btcentralplus.com) has joined #mythtv | |
[16:45:16] | andreax (andreax!~andreaz@p57B92843.dip.t-dialin.net) has joined #mythtv | |
[16:46:52] | gigem (gigem!~david@mythtv/developer/gigem) has quit (Remote host closed the connection) | |
[16:47:17] | gigem (gigem!~david@host103.16.intrusion.com) has joined #mythtv | |
[16:47:17] | gigem (gigem!~david@host103.16.intrusion.com) has quit (Changing host) | |
[16:47:17] | gigem (gigem!~david@mythtv/developer/gigem) has joined #mythtv | |
[16:54:42] | SteveGoodey (SteveGoodey!~steve@host86-160-41-16.range86-160.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[16:57:48] | cocoa117 (cocoa117!~cocoa117@wk-28-147.guest.rdg.ac.uk) has quit (Quit: Leaving) | |
[16:58:16] | SteveGoodey (SteveGoodey!~steve@host109-158-143-184.range109-158.btcentralplus.com) has joined #mythtv | |
[16:58:28] | stoffel (stoffel!~quassel@p57B4B0A6.dip.t-dialin.net) has joined #mythtv | |
[17:06:50] | SteveGoodey (SteveGoodey!~steve@host109-158-143-184.range109-158.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[17:06:53] | Steve_Goodey (Steve_Goodey!~steve@host86-160-43-62.range86-160.btcentralplus.com) has joined #mythtv | |
[17:16:51] | kep (kep!~keke@77.109.206.88) has joined #mythtv | |
[17:17:23] | Goga777 (Goga777!~Goga777@shpd-95-53-212-70.vologda.ru) has quit (Ping timeout: 276 seconds) | |
[17:18:23] | Goga777 (Goga777!~Goga777@shpd-95-53-212-70.vologda.ru) has joined #mythtv | |
[17:30:13] | taylorr: | is it reasonable for a lock or unlock to take 10–12 msec when nothing else is using the lock? |
[17:38:08] | Beirdo: | seems a bit long to me |
[17:38:33] | Beirdo: | a QMutex lock/unlock? |
[17:39:10] | Beirdo: | I've never timed it, but I'd expect < 1ms for a mutex operation that's not blocking |
[17:41:34] | taylorr: | Beirdo: check out ringbuffer.cpp and the ReadPriv() function... it uses the rwlock... I don't see anywhere else these locks could be used when reading directly from a file and the readahead buffer not running |
[17:43:09] | Beirdo: | there might be more of an overhead on a rwlock, but still, seems a bit long |
[17:47:12] | taylorr: | Beirdo: any idea how to debug rwlock.unlock taking too long – how can you determine if there is a lock in use? is there a function that gives the state of the lock? |
[17:47:48] | SteveGoodey (SteveGoodey!~steve@host86-145-236-7.range86-145.btcentralplus.com) has joined #mythtv | |
[17:48:19] | Beirdo: | you can do tryLockForRead or tryLockForWrite, which return false if something else had the lock |
[17:48:26] | Steve_Goodey (Steve_Goodey!~steve@host86-160-43-62.range86-160.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[17:48:37] | Beirdo: | and if nothing else does, gives you the lock, IIRC |
[17:49:31] | Beirdo: | this is where threaded software becomes a supreme bitch |
[17:50:00] | Beirdo: | when you get interthread timing issues, etc. It can be very difficult to debug |
[17:50:40] | Beirdo: | I'd love to see Solaris' dtrace around for Linux so we could use it |
[17:51:37] | Beirdo: | hmm, wikipedia says it has been ported to Linux. interestign |
[17:52:03] | taylorr: | Beirdo: so I got rid of the locks and the delays are still there between the VERBOSE calls – basically I put a VERBOSE call before and after each lock call and after commenting out the lock call I still see the delay between the VERBOSE messages |
[17:52:17] | taylorr: | so it's not a lock issue |
[17:52:25] | Beirdo: | K. |
[17:52:34] | taylorr: | I assume – why would VERBOSE calls take that long? |
[17:52:44] | Beirdo: | it also has a mutex involved |
[17:52:56] | Beirdo: | part of what the new-logging branch is aiming to fix |
[17:53:25] | Beirdo: | and also, it has to actually log it (to disk/screen/both) before it returns |
[17:53:46] | taylorr: | ah, I noticed that if you run the backend with -v file,network,extra responses get extremely slow |
[17:54:18] | taylorr: | ok, well it looks like to debug this I'm making the situation worse :) |
[17:54:19] | Beirdo: | yeah, thankfully, you likely aren't using a 9600 baud console, that would make it even slower :) |
[17:54:54] | Beirdo: | you could try on the new-logging branch if you wanted to |
[17:55:26] | Beirdo: | I need to get the db changes committed for that so we can merge at some point |
[17:55:29] | taylorr: | Beirdo: nothing else is printing but the stuff I added so I wonder how a mutex could be colliding with my VERBOSE calls |
[17:56:01] | taylorr: | I have a TON of VERBOSE calls for playback debug on the frontend and never have had an issue |
[18:00:05] | Beirdo: | yeah, dunno :) That may still be a red herring, but VERBOSE does use a mutex itself, so, it could be somehow related there |
[18:00:33] | fp_ (fp_!~fp@adsl-71-145-161-95.dsl.austtx.sbcglobal.net) has joined #mythtv | |
[18:02:19] | taylorr: | Beirdo: it seems like another thread might be interrupting the ReadPriv() call too ofter and for too long |
[18:03:27] | Beirdo: | hmmm. |
[18:03:29] | taylorr: | the actual read() call to the disk only takes around 10 msec but the overall time to complete the network request can take 100–500 msec |
[18:03:47] | taylorr: | it's like it just sits around between calls and does nothing |
[18:04:20] | Beirdo: | weird |
[18:05:10] | taylorr: | even removing the VERBOSE calls I still see the same amount of delay |
[18:05:24] | Beirdo: | K, so at least we know it's not that. |
[18:05:26] | taylorr: | so the VERBOSE and lock calls were just red herrings |
[18:06:35] | taylorr: | anyway to give the thread that calls ReadPriv a higher priority? |
[18:07:51] | stoffel (stoffel!~quassel@p57B4B0A6.dip.t-dialin.net) has quit (Ping timeout: 252 seconds) | |
[18:08:10] | Beirdo: | yes, QThreads lets you change priority |
[18:08:57] | taylorr: | any idea where the thread is created for ReadPriv gettings called for a network REQUEST_BLOCK? |
[18:09:32] | Beirdo: | Hmm, not off hand. I could check though |
[18:09:51] | Beirdo: | OK, have to run file,extra? |
[18:09:57] | Beirdo: | that's on the frontend, right? |
[18:10:11] | Beirdo: | or on the backend |
[18:10:43] | taylorr: | I'm looking at the backend logs |
[18:10:59] | Beirdo: | oh, I'm gonna have a hard time making it start playing remotely :) |
[18:11:01] | taylorr: | when streaming to a frontend |
[18:11:03] | Beirdo: | heh |
[18:11:08] | Beirdo: | OK, one sec |
[18:12:27] | Beirdo: | OK, restarting the backend with -v system,file,extra |
[18:12:47] | Beirdo: | not sure if I can get it playing remotely right now, but I can try |
[18:13:44] | mrand (mrand!~mrand@ubuntu/member/mrand) has quit (Ping timeout: 260 seconds) | |
[18:14:02] | Beirdo: | hehe, we broke the remote control in mythweb, it seems |
[18:14:23] | GreyFoxx: | that just uses the Network/telnet interface ? |
[18:14:26] | Beirdo: | at least the feedback |
[18:14:53] | Beirdo: | oooh, right, markk took out jpg encoding of the screen cap and we didn't make it use the png yet |
[18:15:45] | Beirdo: | I hit enter on the remote for my frontend, and a bunch of ringbuf stuff spewed in the backend logs. now to hit esc and look for what I was looking for |
[18:16:56] | Beirdo: | 2011-06–01 11:14:58.658636 [23368] ProcessRequest ringbuffer.cpp:1044 (ReadPriv) – RingBuf(/opt/mythtv/video3/default/3111_20110508170000.mpg): ReadPriv(..655360, normal) @0 — begin |
[18:16:56] | MythLogBot: | SVN 23368: (branch master) https://github.com/MythTV/mythtv/commit/f67241b6 |
[18:17:08] | Beirdo: | that's it hitting the log at the top of ReadPriv |
[18:17:17] | Beirdo: | so... it's in the ProcessRequest thread |
[18:19:05] | Beirdo: | which is in programs/mythbackend/mainserver.cpp |
[18:19:09] | Beirdo: | around line 124 |
[18:21:21] | taylorr: | Beirdo: any idea if there is something non-optimal on how it's setup? |
[18:21:35] | Goga777 (Goga777!~Goga777@shpd-95-53-212-70.vologda.ru) has quit (Ping timeout: 240 seconds) | |
[18:22:02] | Beirdo: | nothing strikes me off hand |
[18:24:17] | kormoc is now known as kormoc_afk | |
[18:24:31] | Beirdo: | ugh, other than the long line of inefficient string compares, but... |
[18:24:35] | Beirdo: | that's always been there |
[18:24:42] | Beirdo: | one thing that could help, I guess. |
[18:24:51] | taylorr: | Beirdo: how do you set the priority for the thread? |
[18:25:09] | taylorr: | where is the "long line of inefficient string compares"? |
[18:25:15] | Beirdo: | move the handling of REQUEST_BLOCK to the top of the list |
[18:25:27] | Beirdo: | void MainServer::ProcessRequestWork(MythSocket *sock) |
[18:26:04] | Beirdo: | how many compares before we find that particular string? |
[18:27:20] | taylorr: | QUERY_FILETRANSFER happens first it seems |
[18:27:31] | Beirdo: | we hit that string on line 4904 in my copy |
[18:27:46] | Beirdo: | that's not too bad |
[18:27:56] | Beirdo: | oh wait |
[18:28:06] | Beirdo: | that's in HandleFileTransferQuerey |
[18:28:22] | taylorr: | right which is called from QUERY_FILETRANSFER |
[18:28:30] | taylorr: | which is way down the list |
[18:28:32] | Beirdo: | which is considerably down the list |
[18:28:34] | Beirdo: | yeah |
[18:28:44] | taylorr: | but I can't see how that's causing an issue |
[18:28:54] | Beirdo: | not sure how much it might speed it up to put that higher in the list, but it will speed it iup |
[18:29:10] | Beirdo: | possibly not enough to matter though |
[18:29:21] | Beirdo: | oh, to set priority... sorry |
[18:29:27] | taylorr: | but that's to start the REQUEST_BLOCK process |
[18:29:44] | taylorr: | I'm seeing the issue inside of that command |
[18:29:51] | Beirdo: | Oooh |
[18:30:08] | Beirdo: | thread->setPriority(QThread::HighPriority) |
[18:30:20] | Beirdo: | or HighestPriority |
[18:30:21] | taylorr: | it eventually calls ReadPriv() in ringbuffer.cpp |
[18:31:10] | Beirdo: | so in MainServer's ctor... |
[18:31:32] | Beirdo: | prt->setPriority( ) before prt->start(); |
[18:31:43] | Beirdo: | fill in the priority ya want to try with |
[18:32:13] | Beirdo: | same thing in MainServer::readyRead() where it's increasing the size of the pool |
[18:33:38] | taylorr: | thanks, trying both now |
[18:34:19] | Beirdo: | it's worth a try :) |
[18:38:05] | Beirdo: | so we actually write back the number of bytes received in each block? |
[18:39:21] | taylorr: | I believe so |
[18:39:40] | taylorr: | the frontend request x number of bytes and the backend returns that amount |
[18:39:45] | taylorr: | it didn't help |
[18:39:58] | Beirdo: | I wonder if we aren't hitting some sorta windowing issue? |
[18:40:09] | Beirdo: | like what happens on DSL |
[18:40:17] | taylorr: | what do you mean by "windowing"? |
[18:40:22] | taylorr: | this is backend |
[18:40:47] | Beirdo: | if you look in the FileTransfer::RequestBlock() method |
[18:41:06] | Beirdo: | we read a chunk (asking for a large amount, getting a smaller amount) |
[18:41:34] | Beirdo: | oh hell, never mind |
[18:41:43] | ** Beirdo needs another coffee ** | |
[18:41:52] | Beirdo: | duh, then we send that chunk out |
[18:41:54] | Beirdo: | never mind |
[18:42:18] | Beirdo: | I read it as we sent back the size, missed the "buf" |
[18:43:03] | taylorr: | like I say the safe_read of the local file descriptor takes a very short amount of time – but the rest of the code path seems to stall as if it has sleep calls everywhere |
[18:43:08] | Beirdo: | the only thing in that loop that might cause issues perhaps would be if GetStopReads() were really slow, or writeData were really nastily slow |
[18:43:17] | taylorr: | no idea what is happening during the lulls |
[18:43:25] | Beirdo: | right. |
[18:44:10] | Beirdo: | OK, no way GetStopReads can be slow :) |
[18:44:19] | Beirdo: | return stopreads; |
[18:44:20] | Beirdo: | heh |
[18:45:44] | taylorr: | right, the slowdown is in rbuffer->Read(); |
[18:48:34] | Beirdo: | that's odd. |
[18:48:36] | kormoc_afk is now known as kormoc | |
[18:48:58] | Beirdo: | you put logging around the various parts of the loop and narrowed it down to there? |
[18:49:06] | taylorr: | this is crazy strange though it's like the RingBuffer::ReadPriv() is ran in an UltraLowPriority thread or something |
[18:49:12] | taylorr: | yes |
[18:49:17] | Beirdo: | the loop in RequestBlock, I mean |
[18:49:18] | Beirdo: | OK |
[18:50:01] | taylorr: | no I didn't put anything in RequestBlock |
[18:50:28] | Beirdo: | might be worth it, in case the slowdown ends up being on the sock->writeData |
[18:50:44] | taylorr: | because the delays are in ReadPriv() -> ReadDirect() -> safe_read() |
[18:50:49] | Beirdo: | dunno why it would be though |
[18:50:52] | Beirdo: | right |
[18:51:19] | taylorr: | there maybe some slowdown in sock->writeData but there is significant slowdown in ReadPriv() |
[18:51:24] | taylorr: | for no apparent reason |
[18:51:52] | ** stuartm is not going to read scrollback today ** | |
[18:52:45] | taylorr: | good decision |
[18:53:15] | taylorr: | this problem is now above and beyond my understanding |
[18:53:30] | Steve_Goodey (Steve_Goodey!~steve@host86-144-1-89.range86-144.btcentralplus.com) has joined #mythtv | |
[18:53:52] | SteveGoodey (SteveGoodey!~steve@host86-145-236-7.range86-145.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[18:54:08] | taylorr: | Beirdo: thanks for the help |
[18:54:23] | Beirdo: | no problem. sorry I couldn't help more than I did |
[18:54:42] | stuartm: | taylorr: heh, there is just a lot of it and I'm feeling lazy :) |
[18:54:53] | taylorr: | that's my fault |
[18:55:14] | Beirdo: | oooh |
[18:55:22] | Beirdo: | fileringbuffer.cpp |
[18:55:34] | Beirdo: | take a look at the end of safe_read() |
[18:56:00] | taylorr: | it just shouldn't take the backend 100ms or more to respond with 1MB or less of data |
[18:56:01] | Beirdo: | I hope that delay only manages to kick in during livetv |
[18:56:18] | Beirdo: | if (tot < sz) |
[18:56:19] | taylorr: | yes, this is an "oldfile" so it should bypass those sleeps |
[18:56:26] | Beirdo: | usleep(60000); |
[18:56:30] | taylorr: | it breaks if oldfile |
[18:56:41] | taylorr: | I put in debug to make sure that didn't get hit |
[18:56:45] | Beirdo: | ah OK |
[18:56:49] | Beirdo: | whew |
[18:57:04] | taylorr: | and like I said safe_read() returns with the data in about 10 msec everytime |
[18:57:26] | Beirdo: | which sounds reasonable given enough data to read |
[18:58:01] | SteveGoodey (SteveGoodey!~steve@host86-145-234-240.range86-145.btcentralplus.com) has joined #mythtv | |
[18:58:05] | Beirdo: | why do my fingers keep wanting to type "ringbugger" |
[18:58:12] | Steve_Goodey (Steve_Goodey!~steve@host86-144-1-89.range86-144.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[19:06:26] | SteveGoodey (SteveGoodey!~steve@host86-145-234-240.range86-145.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[19:07:19] | SteveGoodey (SteveGoodey!~steve@host86-145-232-239.range86-145.btcentralplus.com) has joined #mythtv | |
[19:14:40] | SteveGoodey (SteveGoodey!~steve@host86-145-232-239.range86-145.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[19:18:19] | SteveGoodey (SteveGoodey!~steve@host109-158-208-201.range109-158.btcentralplus.com) has joined #mythtv | |
[19:30:14] | mrand (mrand!~mrand@ubuntu/member/mrand) has joined #mythtv | |
[19:41:04] | GreyFoxx: | Well, I see some of the problem with the upnp stuff, I have mine discoverable again |
[19:44:51] | Steve_Goodey (Steve_Goodey!~steve@host86-160-205-251.range86-160.btcentralplus.com) has joined #mythtv | |
[19:45:52] | SteveGoodey (SteveGoodey!~steve@host109-158-208-201.range109-158.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[19:45:56] | GreyFoxx: | Hrm, ok I can see all of my content now |
[19:47:08] | GreyFoxx: | Though the music stuff seems to need updates for changes to the mythmusic schemas |
[19:50:12] | Steve_Goodey (Steve_Goodey!~steve@host86-160-205-251.range86-160.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[19:54:46] | Steve_Goodey (Steve_Goodey!~steve@host86-145-237-65.range86-145.btcentralplus.com) has joined #mythtv | |
[20:01:58] | anykey_ (anykey_!~guedel@46-126-247-133.dynamic.hispeed.ch) has quit (Quit: leaving) | |
[20:03:20] | GreyFoxx: | Markk/okolsi: Please try this patch and let me know if it works for you: http://pastebin.ca/2073379 |
[20:08:04] | anykey_ (anykey_!~guedel@46-126-247-133.dynamic.hispeed.ch) has joined #mythtv | |
[20:19:06] | taylorr (taylorr!~taylorr@unaffiliated/elmojo) has quit (Quit: Leaving) | |
[20:21:15] | taylorr (taylorr!~taylorr@cpe-173-095-144-220.nc.res.rr.com) has joined #mythtv | |
[20:21:15] | taylorr (taylorr!~taylorr@unaffiliated/elmojo) has joined #mythtv | |
[20:21:15] | taylorr (taylorr!~taylorr@cpe-173-095-144-220.nc.res.rr.com) has quit (Changing host) | |
[20:21:39] | taylorr: | Beirdo: so the sock->writeData() only takes 10–11msec |
[20:22:22] | taylorr: | so the meat of the code to send data to the frontend only takes about 22–23 msec but we still see it take over 100 msec |
[20:22:39] | taylorr: | so we are spending anywhere from 100–400 msec doing nothing |
[20:23:12] | Beirdo: | there's gotta be a delay or something, but where?! |
[20:23:44] | taylorr: | Beirdo: it's not in the code path is what I don't understand |
[20:24:10] | kormoc: | it could be up the stack? |
[20:24:23] | Beirdo: | yeah, maybe there's another higher priority thread that's preempting it? |
[20:24:26] | taylorr: | with no code between VERBOSE calls we see it takes more than 10 msec which is crazy |
[20:24:37] | taylorr: | that's what I'm thinking |
[20:24:45] | Beirdo: | how many CPU cores/threads do you have on your test box? |
[20:24:54] | taylorr: | I added the HighestPriority attribute but it didn't help |
[20:25:06] | taylorr: | I have 2 cores with HT so 4 threads |
[20:25:25] | taylorr: | I don't see hardly any cpu getting used |
[20:25:32] | taylorr: | iostat seems pretty idle too |
[20:26:33] | Beirdo: | hmm |
[20:27:00] | taylorr: | I also ran powertop and it doesn't show anything suspect |
[20:27:14] | kormoc: | is the send waiting for the syscall writing to the socket to send the data or buffer it? |
[20:28:14] | taylorr: | kormoc: the delay for that is write to the socket but the delays are happening everywhere |
[20:28:20] | Steve_Goodey (Steve_Goodey!~steve@host86-145-237-65.range86-145.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[20:29:11] | Steve_Goodey (Steve_Goodey!~steve@host109-158-209-218.range109-158.btcentralplus.com) has joined #mythtv | |
[20:29:20] | kormoc: | taylorr, well, if it's context switching, you should be able to tell via /proc/pid/nonvoluntary_ctxt_switches increasing |
[20:29:45] | taylorr: | the irony is that the calls that should take long are faster than the code path doing nothing :) |
[20:29:51] | kormoc: | taylorr, whoops, it's /proc/pid/status for the nonvoluntary_ctxt_switches line |
[20:30:26] | taylorr: | no /proc/pid on this machine :) |
[20:30:33] | kormoc: | ahh |
[20:30:35] | taylorr: | kernel 2.6.32 |
[20:31:00] | kormoc: | so you ditched /proc entirely? I'd have thunk things would break |
[20:31:21] | Beirdo: | /proc/pid meaning to subsitute in the pid number, I would expect |
[20:31:30] | taylorr: | ah, duh |
[20:31:34] | kormoc: | yes |
[20:33:24] | taylorr: | the nonvoluntary_ctxt_switches isn't increasing any |
[20:33:41] | kormoc: | so then it's not another thread butting in |
[20:33:59] | taylorr: | could be another thread in the mythbackend? |
[20:34:24] | Steve_Goodey (Steve_Goodey!~steve@host109-158-209-218.range109-158.btcentralplus.com) has quit (Ping timeout: 250 seconds) | |
[20:34:39] | kormoc: | is voluntary_ctxt_switches increasing? |
[20:34:58] | taylorr: | I saw it increase once |
[20:35:32] | taylorr: | yes, very slowly |
[20:36:12] | kormoc: | I'm looking up what exactly counts as what just to be sure |
[20:38:00] | jpharvey (jpharvey!~jpharvey@host81-151-169-219.range81-151.btcentralplus.com) has quit (Ping timeout: 255 seconds) | |
[20:38:11] | kormoc: | yes, it's pid based, so it could be another mythbackend thread that wouldn't show up in there |
[20:47:50] | kc (kc!~Casper@unaffiliated/kc) has quit (Ping timeout: 260 seconds) | |
[20:49:41] | jpharvey (jpharvey!~jpharvey@host86-135-80-177.range86-135.btcentralplus.com) has joined #mythtv | |
[20:56:13] | jab (jab!~jab@firewall.adslpipe.co.uk) has joined #mythtv | |
[20:56:41] | Jordack (Jordack!~jordack@h69-131-44-221.mdsnwi.tisp.static.tds.net) has quit () | |
[21:07:58] | taylorr: | kormoc: when you show the threads (forgot which key) in top does it show the priority set by QT |
[21:09:50] | kormoc: | if qt is passing it down the stack, yes, it should |
[21:10:14] | kormoc: | which is the only way I'd know of QT being able to manage priority (passing it down to the kernel) |
[21:10:53] | fp_ (fp_!~fp@adsl-71-145-161-95.dsl.austtx.sbcglobal.net) has quit (Quit: Leaving) | |
[21:15:05] | taylorr: | hmm, I see one with a nice of 9 and the rest are 0 |
[21:15:26] | Beirdo: | that would be a lower priority, no? |
[21:15:45] | taylorr: | yes, not sure which thread it is though |
[21:15:57] | taylorr: | I changed to TimeCriticalPriority but it didn't help |
[21:16:03] | kormoc: | what about the PR column? |
[21:16:13] | kormoc: | (that's the priority column qt should be setting) |
[21:16:48] | taylorr: | they are all 20 |
[21:17:14] | taylorr: | 29 threads running :) |
[21:17:29] | taylorr: | maybe that's why it takes so long |
[21:18:32] | Beirdo: | most of them are blocked on mutexes, etc |
[21:19:31] | taylorr: | Beirdo: do we create a new thread for each network transfer? |
[21:19:51] | Beirdo: | we have a pool of them used for each request |
[21:20:02] | Beirdo: | each request will take one from the pool |
[21:20:11] | Beirdo: | Hmmm, wonder how big that pool is |
[21:20:23] | taylorr: | wait the PR is 29 for the one that has nice of 9 |
[21:21:30] | taylorr: | I found this interesting http://lists.qt.nokia.com/pipermail/qt-intere . . . /032349.html |
[21:22:09] | Beirdo: | it has 5 threads |
[21:22:18] | taylorr: | not sure what "Now I believe changing priority of a normal QThread, |
[21:22:18] | taylorr: | which runs with SCHED_OTHER under Linux, has no effect (both min and max priorities are 0)." means. |
[21:22:30] | Beirdo: | it's a MythDeque |
[21:22:43] | taylorr: | Beirdo: is it easy to print the PID of each one of the 5 threads when created? |
[21:24:43] | Beirdo: | not necessarily |
[21:25:16] | Beirdo: | let me go look, one sec |
[21:26:25] | Beirdo: | aha! |
[21:26:28] | kormoc: | the pid will all be the same, no? |
[21:26:29] | Beirdo: | gettid() |
[21:27:15] | taylorr: | each thread has a different under top |
[21:27:32] | Beirdo: | I *think* that might give you a PID that is the same as you could see in /proc or in top (if in the threaded mode) |
[21:27:47] | taylorr: | hit 'H' under top and it should show the threads with different PIDs |
[21:28:34] | Beirdo: | yeah, that might be using gettid(), but it's unclear |
[21:29:02] | Beirdo: | that syscall is for Linux only, so don't expect any joy out of OSX :) |
[21:29:29] | kormoc: | pidof mythbackend only shows one pid for me |
[21:30:16] | Beirdo: | as it should |
[21:30:20] | kormoc: | I have 37 mythbackend threads, one pid |
[21:30:51] | kormoc: | Beirdo, right. /proc/pid is per pid, each thread is under the same pid, so you can't get per thread stats out of /proc/pid |
[21:31:29] | kormoc: | Beirdo, you'd go into /proc/pid/tasks/ and look |
[21:31:56] | kormoc: | oh wow, I'm wrong! |
[21:32:01] | Beirdo: | heh |
[21:32:11] | kormoc: | /proc/pid/task/*/status has per thread ctxt_Switches |
[21:32:22] | kormoc: | so we can watch for those to be changing |
[21:32:38] | Beirdo: | in linux, the pid from getpid is the parent pid, but threads are actually lightweight processes |
[21:33:04] | taylorr: | Beirdo: how do you use the gettid() function? |
[21:33:12] | Beirdo: | it only puts the first pid into /proc |
[21:33:26] | kormoc: | ahh |
[21:33:28] | Beirdo: | gettid() |
[21:33:28] | kormoc: | fair 'nuff |
[21:33:45] | kormoc: | I really only know how that all works from /proc and friends. never really overly looked deep |
[21:33:46] | Beirdo: | it returns a pid_t (which is an int or a short int, forget which) |
[21:34:07] | Beirdo: | man gettid ;) |
[21:34:26] | taylorr: | Beirdo: I was talking about when I create the thread, or do I have to use it when inside the thread? |
[21:34:27] | Beirdo: | now whether that gives us what we want or not, dunno |
[21:34:32] | Beirdo: | ahh |
[21:34:34] | Beirdo: | inside |
[21:34:55] | kormoc: | watch grep ctxt_switches /proc/`pidof mythbackend`/task/*/status can give you per thread ctxt switch overview |
[21:35:26] | Beirdo: | kormoc: yeah under /proc/pid/task/ it lists all the pids that are threads, good find :) |
[21:35:44] | noahric (noahric!4a7d3b41@gateway/web/freenode/ip.74.125.59.65) has joined #mythtv | |
[21:35:47] | Beirdo: | the same ones that top shows if you hit H |
[21:35:49] | kormoc: | I knew it did that, I didn't know it kept stats on them, but it does, rather awesome |
[21:35:59] | Beirdo: | and hopefully the number that gettid() returns |
[21:36:04] | taylorr: | Beirdo: any linux specific functions akin to gettid which will change the priority/nice of the thread? |
[21:36:59] | Beirdo: | yes, but then you are messing with pthreads under Qt's wrappings, and it might do really odd stuff |
[21:38:16] | Beirdo: | you can always renice a thread's pid |
[21:40:15] | taylorr: | Beirdo: filetransfer.cpp:128: error: ‘gettid’ was not declared in this scope |
[21:40:37] | Beirdo: | need #include <sys/types.h> |
[21:40:39] | taylorr: | I added the define #include <sys/types.h> |
[21:40:43] | Beirdo: | K |
[21:40:52] | Beirdo: | try putting :: in front of it |
[21:41:16] | taylorr: | nope |
[21:41:18] | Beirdo: | so it knows that you aren't trying to pull it out of the class |
[21:41:39] | Beirdo: | argh |
[21:41:41] | Beirdo: | one sec |
[21:41:48] | Beirdo: | NOTES Glibc does not provide a wrapper for this system call; call it using syscall(2). |
[21:41:51] | Beirdo: | heh |
[21:42:07] | Beirdo: | man syscall |
[21:42:14] | Beirdo: | it gives a pretty example |
[21:42:23] | Beirdo: | tid = syscall(SYS_gettid); |
[21:42:35] | Beirdo: | how retarded |
[21:43:13] | Beirdo: | sorry, I missed that when reading it the first couple times |
[21:44:34] | Beirdo: | oooh, and you can strace the pid of a thread specifically |
[21:44:39] | taylorr: | thanks got it compiled |
[21:44:42] | Beirdo: | that could help at some point |
[21:45:00] | Beirdo: | I hope it gives the number we are looking for :) |
[21:50:50] | taylorr: | Beirdo: it worked and I was able to renice them |
[21:50:56] | taylorr: | let me see if it helps |
[21:50:57] | Beirdo: | coool :) |
[21:51:33] | taylorr: | renice didn't help |
[21:51:46] | taylorr: | when I renice to -19 it set PR to 1 |
[21:51:52] | taylorr: | I wonder if that is bad or not |
[21:51:57] | kormoc: | that's correct |
[21:52:23] | Beirdo: | the nice thing is, you can strace that particular thread too |
[21:52:38] | kormoc: | you can't affect PR directly. it's set to default at 20. Nice values apply to it, and then any real time settings/etc. 1 is as good as it gets without true real time |
[21:52:40] | taylorr: | looks like linux ignores Qt's attempt to set the thread to TimeCriticalPriority |
[21:52:46] | Beirdo: | it only shows system calls |
[21:53:01] | Beirdo: | you may need to be running as root to get that priority? |
[21:53:12] | taylorr: | good point |
[21:53:31] | kormoc: | strace -c will give you aggregated stats per syscall |
[21:55:19] | taylorr: | root@casper:/usr/local/src/mythtv/git/mythtv-0–24/mythtv/programs/mythback end# strace -c -p 12038 |
[21:55:19] | taylorr: | Process 12038 attached – interrupt to quit |
[21:55:19] | taylorr: | ^CProcess 12038 detached |
[21:55:26] | taylorr: | 0.00 0.000000 0 64 read |
[21:55:28] | taylorr: | 0.00 0.000000 0 448 stat |
[21:55:30] | taylorr: | 0.00 0.000000 0 147 ioctl |
[21:55:32] | taylorr: | 0.00 0.000000 0 115 select |
[21:55:34] | taylorr: | 0.00 0.000000 0 4 nanosleep |
[21:55:36] | taylorr: | 0.00 0.000000 0 32 gettid |
[21:55:38] | taylorr: | ------ ----------- ----------- --------- --------- ---------------- |
[21:55:40] | taylorr: | 100.00 0.240014 2397 24 total |
[21:55:51] | taylorr: | it has errors? what is that all about |
[21:56:02] | Beirdo: | did we miss the first line? |
[21:56:10] | Beirdo: | got a bunch of 0.0% |
[21:56:13] | taylorr: | no |
[21:56:18] | Beirdo: | adding up to 100% |
[21:56:41] | Beirdo: | I think the first line is a futex |
[21:56:49] | taylorr: | seems it spends most of it's time in futex |
[21:57:00] | Beirdo: | which is how most mutex stuff is implemented, and the errors are likely timeout |
[21:57:29] | Beirdo: | hmm, the nanosleep is interesting though |
[21:57:57] | Beirdo: | the futex line didn't make it in the paste, BTW |
[21:58:32] | taylorr: | 95.83 0.230014 200 1152 20 futex |
[21:58:48] | Beirdo: | ah. OK |
[21:59:18] | Beirdo: | 200usec/call. My backend says 179. (different thread, but still) |
[21:59:32] | Beirdo: | I'd bet the 20 errors are ETIMEDOUT |
[22:00:11] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection) | |
[22:00:13] | stuartm: | pastebin? |
[22:01:11] | stuartm: | GreyFoxx: in what way is the upnp stuff out of sync with the mythmusic schema? I can't remember the last time the schema changed |
[22:01:53] | Beirdo: | http://pastebin.com/2iMJjEE8 is my run (on the db logging thread, IIRC) |
[22:02:40] | Beirdo: | yeah, db logger thread (not in master) |
[22:03:17] | Beirdo: | it's about what I expected to see, as that thread does a sleep of 100ms, which gets translated to a futex call |
[22:03:21] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[22:04:49] | Beirdo: | running strace without the -c will show you the spew as it calls system calls, which can help see where the slow timing is |
[22:05:31] | Beirdo: | oooh with a -r could be useful in that case (shows relative timestamp) |
[22:05:47] | taylorr: | how do I redirect to a file so I can post it here? |
[22:06:30] | Beirdo: | -o strace.txt, i think |
[22:06:58] | Beirdo: | pick your own file location and name, of course |
[22:07:07] | taylorr: | duh, yes the -r shows the time it spends, cool |
[22:07:26] | Beirdo: | there's also a -t that shows wall time, it seems |
[22:07:45] | Beirdo: | although to the second, not too useful |
[22:08:26] | Beirdo: | the -r should be enough to see where the slowness is. it measures time from the beginning of the previous system call |
[22:09:06] | Beirdo: | -ttt gives wall time to microseconds |
[22:11:01] | taylorr: | http://mythtv.pastebin.ca/2073418 |
[22:12:09] | taylorr: | those FUTEX_WAIT_PRIVATE calls look like the problematic ones |
[22:12:56] | taylorr: | I guess it stat's /etc/localtime for the VERBOSE output |
[22:12:58] | Beirdo: | holy crap do we ever poke the timezone a lot |
[22:13:00] | Beirdo: | yeah |
[22:13:28] | taylorr: | need to figure out the FUTEX_WAIT_PRIVATE |
[22:13:46] | taylorr: | 11 msec a piece which is what I was seeing |
[22:13:55] | Beirdo: | well, in between futex calls is where the delay is |
[22:14:26] | j-rod is now known as j-rod|afk | |
[22:15:56] | taylorr: | Beirdo: any idea why that's getting called so much? |
[22:16:36] | Beirdo: | it's used as the basis of mutexes, sleep, usleep, and a few other things IIRC |
[22:17:18] | Beirdo: | the 4th argument is a timeout argument (struct timespec *) |
[22:18:02] | Beirdo: | so, around line 255, it looks like a mutex implementation |
[22:18:22] | Beirdo: | it's waiting on a mutex (or equivalent) and then 11ms later something wakes it |
[22:28:56] | Beirdo: | the FUTEX_CMP_REQUEUE_PRIVATE seems to be like a .wake() on a QWaitCondition |
[22:29:49] | Beirdo: | so I guess the question is... what exactly are we blocking on? |
[22:30:36] | taylorr: | could it be the locks? |
[22:30:44] | Beirdo: | most likely |
[22:30:45] | jab (jab!~jab@firewall.adslpipe.co.uk) has quit (Ping timeout: 252 seconds) | |
[22:30:57] | taylorr: | but when I removed them it didn't seem to help any |
[22:34:55] | Beirdo: | hmm, maybe an strace -r -T would be handy |
[22:35:46] | Beirdo: | shows not only the timing between each syscall start, but also shows the duration of each syscall |
[22:36:26] | Beirdo: | which could show if the futex itself is taking time (waiting on a lock or whatever), or if the delay is between the futex calls |
[22:36:28] | taylorr: | Beirdo: ok, recompiling with some of the locks removed |
[22:36:37] | Beirdo: | k |
[22:40:34] | taylorr: | Beirdo: http://mythtv.pastebin.ca/2073428 |
[22:41:24] | Beirdo: | OK |
[22:41:58] | Beirdo: | looking around line 115 (just before logging the sock->wr)... |
[22:42:18] | Beirdo: | it seems that the futex locks are actually what is using the time |
[22:42:41] | Beirdo: | not sure which locks those would be. |
[22:44:15] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Ping timeout: 246 seconds) | |
[22:44:37] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[22:47:42] | taylorr: | Beirdo: those are probably poslock in RingBuffer::Read() |
[22:48:20] | taylorr: | killing them to see if it helps |
[22:48:36] | Beirdo: | hmmm, OK, just gonna go take a look :) |
[22:49:32] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Ping timeout: 276 seconds) | |
[22:51:07] | zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv | |
[22:54:38] | taylorr: | Beirdo: http://mythtv.pastebin.ca/2073430 – with the poslock removed in RingBuffer::Read() |
[22:55:05] | taylorr: | yep it got rid of those WAITs |
[22:56:24] | taylorr: | there's a bunch right after gettid |
[22:58:09] | taylorr: | Beirdo: ok, lines 23–31 |
[22:58:41] | taylorr: | that's just tid = syscall(SYS_gettid); VERBOSE(VB_IMPORTANT, QString("Request thread tid = %1").arg(tid)); |
[22:58:50] | zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Ping timeout: 276 seconds) | |
[22:58:57] | taylorr: | all those WAITs are between gittid and the VERBOSE call |
[22:59:18] | taylorr: | so we spend almost 30 msec doing nothing |
[22:59:29] | Beirdo: | hmmm |
[22:59:33] | Beirdo: | one sec |
[22:59:43] | taylorr: | seems like these WAITs occur after stat(/etc/localtime) very often |
[23:00:34] | taylorr: | Beirdo: I gotta run, but will be back later tonight |
[23:00:40] | Beirdo: | OK |
[23:00:50] | taylorr: | thanks for the help Mr. Manpage :) |
[23:00:52] | Beirdo: | I'm wondering if that isn't actually IN Verbose :) |
[23:00:57] | Beirdo: | heh, any time :) |
[23:01:06] | Beirdo: | they are actually useful, sometimes :) |
[23:01:31] | taylorr: | Beirdo: quite possible and is the only explanation that I can think of |
[23:01:36] | taylorr: | maybe we need to check it out |
[23:02:19] | taylorr: | the other question is why do the WAITs always seem to take around 10–11 msecs? |
[23:02:24] | Beirdo: | yeah, I'll try to recreate your test case in new-logging as well |
[23:02:39] | Beirdo: | yeah, I'm not sure about that. that seems odd |
[23:04:17] | andreax (andreax!~andreaz@p57B92843.dip.t-dialin.net) has quit (Read error: Connection reset by peer) | |
[23:04:19] | taylorr: | Beirdo: I've always noticed that the backend with many verbose options made it extremely sluggish |
[23:04:40] | taylorr: | if the VERBOSE macro is causing that much delay then it's no wonder |
[23:04:55] | Beirdo: | yeah, I didn't realize it was THAT bad, but yeah |
[23:05:16] | natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has quit (Read error: Operation timed out) | |
[23:08:06] | taylorr: | Beirdo: it's so bad with logging turned on for the backend that the fanart loading can still be going at least 6 seconds into the beginning of playback |
[23:08:18] | Beirdo: | holy crap |
[23:08:34] | taylorr: | it seems to be a common problem for the mythbackend program |
[23:08:51] | taylorr: | locks and VERBOSE calls seem to take forever – even if the lock is free |
[23:09:04] | Beirdo: | I'll definitely do some timing comparisons with the new code |
[23:10:15] | kep (kep!~keke@77.109.206.88) has quit (Quit: leaving) | |
[23:13:11] | Beirdo: | slowness is not a good trait. |
[23:15:17] | GreyFoxx: | stuart: When I navigate the music stuff I see tons of SQL complaints about a missing column on the backend |
[23:16:06] | kc (kc!~Casper@unaffiliated/kc) has joined #mythtv | |
[23:17:16] | GreyFoxx: | stuartm: Tons of these http://pastebin.ca/2073437 |
[23:18:15] | Beirdo: | taylorr: where did you put gettid, and what verbose level were you testing at (So I can try as well)? |
[23:18:47] | noahric (noahric!4a7d3b41@gateway/web/freenode/ip.74.125.59.65) has quit (Quit: Page closed) | |
[23:22:48] | stuartm: | GreyFoxx: that's not a schema change, it's a mistake in the original query, should be album.album_name |
[23:23:49] | stuartm: | it even gets it right near the beginning of the query but wrong at the end in the 'order by' clause |
[23:24:14] | GreyFoxx: | hehe fun :) |
[23:24:26] | GreyFoxx: | I hadn't looked too closely :) |
[23:25:08] | stuartm: | I wonder how long it's been broken without anyone noticing or reporting the problem |
[23:25:18] | GreyFoxx: | I was just wondering that |
[23:27:47] | dagar_ (dagar_!~dagar@agar.ca) has quit (Read error: Operation timed out) | |
[23:29:52] | dagar (dagar!~dagar@agar.ca) has joined #mythtv | |
[23:32:57] | abqjp (abqjp!~abqjp@97-119-171-42.albq.qwest.net) has quit (Quit: abqjp) | |
[23:34:50] | GreyFoxx: | commited |
[23:34:52] | GreyFoxx: | yay |
[23:35:16] | GreyFoxx: | I should look through tickets though to see if it is in one of them |
[23:35:37] | markk: | GreyFoxx: search patch doesn't seem to be working :( |
[23:35:49] | GreyFoxx: | Hmmmm |
[23:36:23] | GreyFoxx: | Can you send me a log snippet of -v socket,upnp,extra when you trigger the search ? |
[23:36:48] | GreyFoxx: | and what are you using to access it ? |
[23:38:05] | fphillips (fphillips!~fp@adsl-71-145-161-95.dsl.austtx.sbcglobal.net) has joined #mythtv | |
[23:41:00] | markk: | logs shortly – it actually looks like some software/devices work and some don't. I'm guessing a [ ] wrapper on the ipv4 address... |
[23:41:08] | GreyFoxx: | yup me too |
[23:41:14] | GreyFoxx: | http://pastebin.ca/2073443 |
[23:41:19] | GreyFoxx: | I was about to put this up hehe |
[23:41:33] | GreyFoxx: | if that makes it work overall I'll fix it properly |
[23:41:44] | GreyFoxx: | Well, I'll fix it properly either way, I just wanna know if that does it :) |
[23:41:57] | GreyFoxx: | there is 1 or 2 other places that might need it as well which I'm checking now |
[23:42:45] | GreyFoxx: | it looks I applied the [] fix to upnptasknotify already so I'll do the same to tasksearch |
[23:46:05] | markk: | GreyFoxx: that is better – it now comes up on 2 out of 4 iphone apps I test with – but still nothing with the other 2. |
[23:46:42] | Dave123 (Dave123!~dave@cpe-74-74-200-106.rochester.res.rr.com) has quit (Quit: Leaving) | |
[23:46:50] | GreyFoxx: | I think I know what that might be |
[23:46:57] | GreyFoxx: | I'll give you another patch in just a sec |
[23:47:11] | markk: | frontend seems to pick it up as well now – as well GUPnP Universal Control Point |
[23:47:22] | markk: | presentation URL is ipv6 |
[23:48:15] | GreyFoxx: | yeah |
[23:48:50] | GreyFoxx: | http://pastebin.ca/2073447 |
[23:49:24] | GreyFoxx: | I wonder if there is any good upnp browsers for my android phone |
[23:49:30] | GreyFoxx: | right now all I have to test with is djmount |
[23:50:16] | GreyFoxx: | heh my WinXP virtual is picking up the BE again now heh |
IRC Logs collected by
BeirdoBot.
Please use the above link to report any bugs.