MythLogBot@irc.freenode.net :: #mythtv

Daily chat history

Current users (83):

aloril, Anssi, anykey_, brfransen, CaCtus491, Captain_Murdoch, cattelan, Chutt, clever, cocoa117, coling, Cougar, Czar_Away, damaltor, danielk22, David_Miller, dblain, dekarl, dinamic|screen, dlblog, eharris_, ElmerFudd, foxbuntu, ghoti, GreyFoxx, highzeth, idl0r, J-e-f-f-A, jams, jarle, jcarlos, joe____, joki, Jordack, jpabq, jstenback, jwh, jwhite, kc, knightr, kormoc, kurre2, kwmonroe, laga, mag0o, markcerv, MaverickTech, Mousey, mrec, MythBuild, MythLogBot, mzanetti, NightMonkey, peitolm, Peps, petefunk, pheld, poptix, purserj, rich0, rsiebert_, seld, Sharky112065, skd5aner, Slasher`, SmallR2002, sphery, stichnot_, sutula, tgm4883, toeb, tomimo, tris, Unhelpful, vallor, Vernon_at_work, wagnerrp, whoDat, xavierh, XDS2010, yb0t, _charly_, _Techie_-_AFK_
Monday, June 25th, 2012, 00:00 UTC
[00:00:20] Beirdo: although right NOW the backend was at 66 for a moment.
[00:00:32] Beirdo: oooh. mythfilldatabase just finished
[00:00:40] Beirdo: it was rescheduling
[00:02:08] Beirdo: anyways, I should finish off getting the logserver to die on restart properly
[00:04:21] andreax (andreax!~andreaz@p54BF348C.dip.t-dialin.net) has quit (Read error: Connection reset by peer)
[00:26:42] taylorr (taylorr!~taylorr@unaffiliated/elmojo) has joined #mythtv
[00:30:22] stichnot: Beirdo: The key issue I'm seeing is growth of the memory footprint of mythbackend, as measured by the VIRT column of "top". When all 3 remote frontends are playing recordings, I observe the memory size grow by 1 MB every ~5 seconds. After a few days this really adds up. The high CPU use appears to be a result of malloc/free spending more and more effort scavenging memory from a fragmented heap.
[00:36:01] stichnot: If you're not seeing this behavior, it may be OS related (for me, 32-bit 2.6.32-27-generic #49-Ubuntu SMP). And also if you're not seeing it, I'll continue digging.
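The growth stichnot describes can be tracked without eyeballing top by polling /proc/<pid>/statm, whose first two fields are total program size and resident set size in pages. A minimal, Linux-only sketch; the 5-second interval mirrors the rate quoted above and everything else is illustrative:

    // watch_mem.cpp -- print VIRT/RES (and deltas) for a pid every 5 seconds.
    // Linux-only sketch: reads /proc/<pid>/statm (sizes reported in pages).
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 2)
        {
            std::cerr << "usage: watch_mem <pid>" << std::endl;
            return 1;
        }
        std::string path = std::string("/proc/") + argv[1] + "/statm";
        long page_kb = sysconf(_SC_PAGESIZE) / 1024;
        long prev_virt = 0, prev_res = 0;
        for (;;)
        {
            std::ifstream statm(path.c_str());
            long virt_pages = 0, res_pages = 0;
            if (!(statm >> virt_pages >> res_pages))
                break;                      // process went away
            long virt = virt_pages * page_kb;
            long res  = res_pages  * page_kb;
            std::cout << "VIRT " << virt << " kB (+" << (virt - prev_virt) << ")  "
                      << "RES "  << res  << " kB (+" << (res  - prev_res)  << ")"
                      << std::endl;
            prev_virt = virt;
            prev_res  = res;
            sleep(5);
        }
        return 0;
    }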
[00:36:14] taylorr (taylorr!~taylorr@unaffiliated/elmojo) has quit (Ping timeout: 272 seconds)
[00:38:28] taylorr (taylorr!~taylorr@unaffiliated/elmojo) has joined #mythtv
[00:45:04] Beirdo: I haven't seen either of these things, but I'll go relocate to the TV and watch carefully while I have playback :)
[00:45:31] Beirdo: I suspect it's either a glibc or a Qt difference
[00:45:53] Beirdo: doubtful the kernel itself has changed that behavior (that would be my thinking)
[00:46:26] Beirdo: and I'm runnin 2.6.34-rc1-jarod1
[00:46:32] Beirdo: and have been for 2 years now
[00:47:42] Beirdo: unless they changed the default kernel page size or some retarded thing
[00:49:09] Beirdo: I would expect that the image caching thing danielk changed in the last 24h is a far more likely candidate in my mind
[00:49:29] Beirdo: but that's more used in the frontend, I would have thought
[00:49:33] Beirdo: hmm, never mind
[00:49:48] Beirdo: well, let me go clean off a chair so I can relocate
[00:50:07] Peps (Peps!~MiNT@li186-230.members.linode.com) has quit (Ping timeout: 252 seconds)
[00:51:54] Peps (Peps!~MiNT@li186-230.members.linode.com) has joined #mythtv
[00:53:58] Beirdo: OK, it's recording one show right now, on the hdpvr
[00:54:08] Beirdo: 1% CPU, 355m resident
[00:54:50] Beirdo: playing back to the frontend
[00:55:20] Beirdo: 3% CPU, 362M resident
[00:58:11] Beirdo: it is growing though
[00:58:14] Beirdo: 380M
[00:59:50] danielk22: Just as another datapoint I'm using Ubuntu 12.04 64-bit 3.2.0-23-lowlatency. I see the high memory use, but not the high CPU use.
[01:01:21] danielk22: I might not notice the high CPU use much as my systems are mostly i5. Even my older core2 system is running paused with the broken OSD caching right now and usage is in the single digits.
[01:02:42] zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection)
[01:03:33] Beirdo: I think the memory may be happening here slowly too
[01:03:53] Beirdo: 451M resident and growing
[01:03:53] danielk22: Beirdo: Fedora?
[01:04:06] danielk22: Beirdo: what is the virtual use?
[01:04:37] Beirdo: Ubuntu 10.04 with hand upgrades for a few packages like x264
[01:04:45] dekarl (dekarl!~dekarl@p4FCEE085.dip.t-dialin.net) has joined #mythtv
[01:04:50] Beirdo: 1220M and growing on virt
[01:05:11] Beirdo: 12% CPU
[01:05:12] zombor (zombor!~zombor_@65.29.231.135) has joined #mythtv
[01:05:12] zombor (zombor!~zombor_@65.29.231.135) has quit (Changing host)
[01:05:12] zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv
[01:05:36] Beirdo: and it's logging lots of "Poll giving up 2" for my HDPVR right now. about every 5s
[01:05:41] gregL (gregL!~greg@cpe-74-76-125-87.nycap.res.rr.com) has quit (Read error: Connection reset by peer)
[01:05:46] dekarl1 (dekarl1!~dekarl@p4FCEF123.dip.t-dialin.net) has quit (Ping timeout: 265 seconds)
[01:06:28] Beirdo: Oh, it's also recording on the 2250, both analog tuners
[01:07:32] danielk22: Beirdo: For me at least the memory use does not correlate with logging frequency. Even with the LOG macro disabled the virtual memory use is huge and the resident set grows.
[01:07:46] Beirdo: yeah, it doesn't seem to correlate to logging
[01:08:03] Beirdo: that was just a side note, in case something odd in HDPVR recording was related
[01:08:24] Beirdo: but yah, 14–16% is a lot higher than it used to be
[01:08:30] Beirdo: and 108% on a commflag
[01:08:36] Beirdo: which is normal
[01:09:16] Beirdo: so there might be some requests from commflag hitting the socket
[01:11:31] danielk22: Beirdo: http://stackoverflow.com/questions/5841896/0m . . . dsafe-manner
[01:11:58] Beirdo: we don't share sockets
[01:12:03] danielk22: ok
[01:12:18] Beirdo: the only threads using them are the logging thread and the logserver thread
[01:12:38] Beirdo: and they communicate over each end of a socket in the one place (logserver) where both run
[01:13:29] Beirdo: BTW, the logserver is not growing CPU or memory at all
[01:14:03] Beirdo: so I'm expecting we might find the issue in the reference counter stuff maybe?
[01:14:34] danielk22: Beirdo: Nope, only object leaked.
[01:14:56] Beirdo: hmm
[01:15:33] danielk22: The mythbackend event socket is never collected at app exit, but that shouldn't cause any problems like this.
[01:15:34] Beirdo: I guess we COULD run a valgrind run with listing all known data chunks on exit
[01:15:58] Beirdo: would be a severe pain in the ass to read, and would be huge
[01:16:12] Beirdo: but might show the problem buried in the mess
[01:16:30] danielk22: Beirdo: I've already run valgrind, at app exit the memory is reaped.
[01:17:21] Beirdo: hmm
[01:17:22] danielk22: (well not all of it, but nearly all).
[01:18:49] Beirdo: I'm thinking of doing a run with --read-var-info=yes
[01:18:54] Beirdo: still looking ;)
[01:19:56] Beirdo: ah yes
[01:20:01] Beirdo: --show-reachable=yes
[01:21:19] Beirdo: could also run it as callgrind to profile
[01:22:45] gregL (gregL!~greg@cpe-74-76-125-87.nycap.res.rr.com) has joined #mythtv
[01:23:20] Beirdo: helgrind?
[01:23:28] Beirdo: never seen that one before.
[01:31:58] Beirdo: K, gonna do a local update in case that helps some.
[01:32:23] Beirdo: then I'll try to get the logserver exiting cleanly when another is running.
[01:32:49] Beirdo: I didn't get that finished on Friday before my now ex-coworkers took me out to get me drunk on my last day
[01:42:02] Beirdo: and after like 8 pints of good cider, I think I fulfilled that goal nicely
[01:44:06] Beirdo: ah jeez, I need more memory in my backend box
[01:44:22] danielk22: Beirdo: I did some stepping on "catch syscall mmap" and It looks like mythfrontend is up to 1GB by the time we're in MythContext::Init()
[01:44:40] Beirdo: mmap is used to pull in the .so files
[01:44:50] Beirdo: AFAIK
[01:45:00] Beirdo: not sure if that's what you're catching though
[01:45:21] Beirdo: i.e. every library we use... plus the frontend plugins
[01:45:22] danielk22: Right, I stepped though those and it got up to about 440 MB.
[01:45:38] Beirdo: what did? VIRT?
[01:45:46] danielk22: VIRT
[01:45:51] Beirdo: yeah, that's irrelevant
[01:46:03] Beirdo: what really matters isn't VIRT
[01:46:04] danielk22: So the first 440 MB is definitely irrelevant.
[01:46:06] Beirdo: it's RES
[01:46:10] Beirdo: yeah
[01:46:19] Beirdo: after that, I guess it might be more relevant :)
[01:46:40] danielk22: Beirdo: Well RES isn't the only thing that matters, it's just that the VIRT which maps to libs doesn't matter so much.
[01:46:49] Beirdo: yup
[01:47:09] Beirdo: that's a better way to put it ;)
[01:47:16] danielk22: VIRT that maps to real memory does matter; it's just difficult to tease that out..
[01:47:31] Beirdo: now, endless growth of VIRT still does point to an issue
[01:48:29] danielk22: Right, glibc malloc uses mmap for allocations above a certain threshold, I was hoping maybe it was a big malloc we could identify.
[01:48:50] danielk22: But I think we'd need to use our own malloc to find it if that is the case.
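Background for the mmap remark: glibc serves allocations larger than M_MMAP_THRESHOLD (128 kB by default, and adjusted dynamically unless set explicitly) with mmap(), so they appear as their own anonymous segments in pmap. A glibc-specific sketch of forcing more allocations down that path so they become individually visible; MythTV does not actually do this, it is only a diagnostic experiment:

    // Diagnostic experiment only: lower glibc's mmap threshold so that more
    // large allocations get their own mapping and can be spotted in pmap.
    // Setting the threshold explicitly also disables glibc's dynamic tuning.
    #include <malloc.h>
    #include <cstdio>
    #include <cstdlib>

    int main()
    {
        mallopt(M_MMAP_THRESHOLD, 64 * 1024);   // allocations >= 64 kB use mmap()
        void *big = std::malloc(1024 * 1024);   // should show up as a ~1 MB anon segment
        std::printf("1 MB block at %p -- run 'pmap <pid>' now and look for it\n", big);
        std::getchar();                         // pause so pmap can be run
        std::free(big);
        return 0;
    }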
[01:48:58] Beirdo: ahh
[01:49:10] Beirdo: pmap might be a bit useful too
[01:49:19] danielk22: too many mmaps to catch every one and take a trace..
[01:50:18] danielk22: Knowing that a big chunk is taken before MythContext::Init() might be useful though.
[01:50:41] Beirdo: < 0000000001853000 522484K rw--- [ anon ]
[01:50:42] Beirdo: ---
[01:50:42] Beirdo: > 0000000001853000 524144K rw--- [ anon ]
[01:50:47] Beirdo: it's heap growth
[01:51:09] Beirdo: that was the diff of two pmap runs about 15s apart
[01:51:25] Beirdo: now, how to get more specific... hmmm
[01:51:58] danielk22: BTW Another 1.2 GB allocated after MythContext::Init()..
[01:52:09] Beirdo: images for the frontend?
[01:52:24] Beirdo: you doing this on the frontend or backend? I forget
[01:52:47] danielk22: This is frontend. Images could be a big chunk of it.
[01:52:58] Beirdo: yeah, that would be my suspicion there
[01:53:13] Beirdo: 1080i images are fairly large :)
[01:53:27] danielk22: Prolly tracing the backend would be more useful.. ;]
[01:54:04] danielk22: Or the logserver, that should be simpler actually.. it's using about 2GB
[01:54:14] Beirdo: how the heck?!
[01:54:32] Beirdo: 488M virt / 23M RES on my backend
[01:54:33] Beirdo: heh
[01:55:12] danielk22: Strangely enough EACH one is using exactly 1928m
[01:55:12] Beirdo: DOH
[01:55:30] Beirdo: I forgot to check the RES/VIRT on the backend just before stopping playback
[01:55:31] Beirdo: crap
[01:55:35] Beirdo: hmm
[01:55:38] amejia (amejia!~andres@xbmc/staff/amejia) has quit (Quit: Konversation terminated!)
[01:55:44] Beirdo: some of that could be thread stacks
[01:56:14] Beirdo: 1524/563M before starting this playback
[01:56:45] Beirdo: many of the [anon] segments in pmap are just that. stacks for each thread (usually 8MB each)
[01:57:34] danielk22: I did a google on "zeromq memory footprint" each one should be about 8MB.. I can't imagine it's using that many thread stacks..
[01:58:03] Beirdo: well, all pthreads stacks default to 8MB
[01:58:33] danielk22: Right, but there are not that many threads in mythlogserver that it should be a huge issue..
[01:58:50] Beirdo: the logserver should have like 8M * 8 IO threads (I think it is)
[01:59:49] danielk22: 24 threads here.. so 192MB
[02:00:06] Beirdo: K, that could be.
[02:00:22] Beirdo: doubt that's the issue at hand
[02:00:42] Beirdo: running a pmap against that process might be a good plan for looking into though
[02:00:47] danielk22: 10 zmq threads
[02:02:12] Beirdo: IIRC, the first [anon] segment is the global heap
[02:02:29] danielk22: Beirdo: there are a large number of 64MB [anon] segments.
[02:03:16] Beirdo: Hmm, got a pastebin?
[02:03:44] Beirdo: could be your setup (or version of Qt) is using different sized default thread stack sizes
[02:03:51] Beirdo: although that shouldn't keep growing
[02:04:10] Beirdo: like 64MB instead of 8MB
[02:05:19] danielk22: 23 64MB anon's and 24 8MB ones
[02:05:31] danielk22: w / 24 threads
[02:05:44] Beirdo: this sounds like it's Qt :)
[02:05:46] FuzzyTheBear (FuzzyTheBear!~anonymous@bas2-montreal33-2925042839.dsl.bell.ca) has joined #mythtv
[02:06:00] FuzzyTheBear (FuzzyTheBear!~anonymous@bas2-montreal33-2925042839.dsl.bell.ca) has left #mythtv ("gone chase butterflies.")
[02:06:04] Beirdo: like maybe a 64MB heap per thread (other than main thread) perhaps?
[02:06:22] Beirdo: which would be pointless and silly
[02:06:29] danielk22: Could me, easy enough to check too..
[02:06:36] danielk22: s/me/be/
[02:06:37] Beirdo: Oh come ON you stupid HDPVR
[02:06:55] Beirdo: getting spammed with Poll giving up 2 again on this recording
[02:07:05] Beirdo: betcha it's garbage
[02:07:38] danielk22: Beirdo: Pretty sure that poll giving up with an HDPVR means we try to reset the device...
[02:07:53] Beirdo: yeah
[02:08:01] Beirdo: which means the recording is hosed
[02:08:12] Beirdo: when it happens every 5s anyways
[02:08:25] danielk22: Yep, if you get one of those in a recording it might be watchable..
[02:09:03] Beirdo: well, I'm gonna kick it.
[02:10:17] Beirdo: there
[02:10:25] Beirdo: hopefully it wants to behave now
[02:14:52] danielk22: Beirdo: Could be thread local memory too..
[02:16:32] danielk22: Ok, it wasn't the stack size.
[02:17:40] danielk22: I lowered the stack size on MThread threads to 2MB and all but 10 of the 24 8 MB allocations fell to 2.4 MB
[02:18:01] danielk22: Presumably the other 10 8MB allocations are the 10 zeromq threads.
[02:18:27] danielk22: The 23 64 MB allocations are still there.
[02:19:01] danielk22: But it could very well be something thread related considering the count..
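The stack-size experiment is easy to reproduce outside MThread: QThread::setStackSize() must be called before start() and overrides the usual 8 MB pthread default. A rough sketch of the idea, not the MThread change itself:

    // Sketch: shrinking a worker thread's stack before it starts.
    // QThread::setStackSize() must be called before start(); 0 means platform default.
    #include <QCoreApplication>
    #include <QThread>
    #include <QDebug>

    class Worker : public QThread
    {
    protected:
        virtual void run()
        {
            qDebug() << "worker running, stack size set to" << stackSize();
            // ... real work would go here ...
        }
    };

    int main(int argc, char **argv)
    {
        QCoreApplication app(argc, argv);
        Worker w;
        w.setStackSize(2 * 1024 * 1024);   // 2 MB instead of the 8 MB pthread default
        w.start();
        w.wait();
        return 0;
    }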
[02:19:44] Beirdo: K
[02:20:06] Beirdo: yeah, thread-local storage (heap, basically) would be my suspicion for those
[02:20:26] danielk22: It's possibly a red herring too. If these 64 MB allocations are thread-local storage they won't be growing...
[02:20:51] danielk22: gotta run pmap after a long run and compare to short run.
[02:21:39] danielk22: The increased _initial_ VIRT is probably just those extra 10 zeromq threads, and nothing to worry about.
[02:21:51] danielk22: It's the growth that matters.
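One candidate for the 64 MB anon blocks that the log never nails down, offered here only as a hedged guess, is glibc's per-thread malloc arenas: on 64-bit glibc 2.10 and later each arena reserves a 64 MB heap of address space, and the arena count scales with the thread count, which inflates VIRT while barely touching RES. If that is what these are, capping the arena count (MALLOC_ARENA_MAX in the environment, or mallopt as below) would shrink the initial VIRT without affecting the real growth:

    // Speculative, glibc-specific: cap the number of malloc arenas so per-thread
    // 64 MB address-space reservations stop multiplying with the thread count.
    // Must run before threads start allocating. Equivalent to MALLOC_ARENA_MAX=2.
    #include <malloc.h>

    #ifndef M_ARENA_MAX
    #define M_ARENA_MAX (-8)   // some malloc.h versions don't expose the constant
    #endif

    int main()
    {
        mallopt(M_ARENA_MAX, 2);
        // ... create threads and run the application as usual ...
        return 0;
    }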
[02:26:09] danielk22: Beirdo: how did you get the main heap size?
[02:28:22] Beirdo: sorry, was off in another window
[02:28:34] Beirdo: in pmap, the first [anon] segment is the main heap
[02:28:53] Beirdo: and yeah, it's the endless growth that's an issue
[02:30:00] Beirdo: on my box, it's fairly clear.. the main heap starts at a low address like 0000000001159000
[02:30:04] danielk22: Really? it shows at 7452K here for a backend that is 2.6GB VIRT and 1.2G RES
[02:30:21] Beirdo: and then the next segment is at 00007f500df08000
[02:30:57] Beirdo: 852952 here
[02:31:05] Beirdo: and the VIRT shows 1831M
[02:31:40] Beirdo: the way to be "sure"
[02:31:47] Beirdo: pmap the pid > file
[02:31:49] Beirdo: wait...
[02:31:52] Beirdo: do it again
[02:31:54] Beirdo: diff them
[02:31:55] Beirdo: :)
[02:32:19] Beirdo: I know it sounds obvious, but...
[02:34:39] Beirdo: If I remember right, the first 4 segments in pmap are: text, data, bss, heap
[02:34:46] Beirdo: it might be bss, data, heap
[02:35:18] Beirdo: either that or it's the global stack, but I don't think so.
[02:35:21] danielk22: looks like text is first, and heap is 4th.. it's just tiny here.
[02:35:33] Beirdo: yeah text is definitely first
[02:35:45] Beirdo: :) it's the order of bss and data I forget
[02:36:08] danielk22: Doesn't matter, one is 16K, the other 8K. :P
[02:36:17] Beirdo: exactly
[02:36:26] Beirdo: and are not gonna be growing
[02:40:28] danielk22: Hmm, looks like 7 8MB blocks were allocated when I did a record, and only 2 were freed.. This may be getting somewhere...
[02:40:47] zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection)
[02:40:52] Beirdo: so it created 7 threads, and removed all but 2?
[02:41:13] danielk22: +7 -2 = +5
[02:41:30] Beirdo: I need to make a gdb script to attach, backtrace to a file
[02:41:36] Beirdo: ooh, I misread that
[02:41:37] danielk22: I don't think the thread count went up, but maybe we didn't collect them..
[02:41:54] danielk22: Anyway time for bed...
[02:42:03] Beirdo: night :)
[02:42:25] Beirdo: I hope to have the logserver exiting tidily on double start shortly
[02:42:32] danielk22: Could be something else too, sockets or something.. but in any case it grows.. stays at 7 for a bit then falls to 5 a bit later.
[02:43:01] Beirdo: something to investigate, for sure
[02:43:02] danielk22: Each 8MB anon is paired with a 4K one.
[02:43:23] Beirdo: yeah, it puts a 4k guard section likely with thread info in it
[02:43:34] Beirdo: I've never found out precisely why
[02:44:04] Beirdo: 87% cpu
[02:44:10] Beirdo: 1G res
[02:44:13] Beirdo: jeez
[02:51:27] Beirdo: CPU went from 97% to 3% when playback stopped
[02:51:42] Beirdo: so maybe something buried in the ring buffer?
[02:51:58] Beirdo: memory didn't decrease though
[02:53:29] Beirdo: I'll give it time to finish this recording, then restarting under valgrind
[03:04:39] Beirdo: OK, this will be fun
[03:12:11] zombor (zombor!~zombor_@65.29.231.135) has joined #mythtv
[03:12:11] zombor (zombor!~zombor_@65.29.231.135) has quit (Changing host)
[03:12:12] zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv
[03:30:45] zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection)
[03:30:55] cattelan is now known as cattelan_away
[03:33:58] Beirdo: crap
[03:34:11] Beirdo: I didn't have --leak-check=full turned on
[03:34:12] stichnot (stichnot!~chatzilla@mythtv/developer/stichnot) has quit (Ping timeout: 272 seconds)
[03:34:13] Beirdo: sigh
[03:34:16] Beirdo: let's try that again
[03:54:37] stichnot (stichnot!chatzilla@nat/intel/x-yutehzgpllghtxvv) has joined #mythtv
[03:54:37] stichnot (stichnot!chatzilla@mythtv/developer/stichnot) has joined #mythtv
[03:54:37] stichnot (stichnot!chatzilla@nat/intel/x-yutehzgpllghtxvv) has quit (Changing host)
[05:13:15] Czar_Away (Czar_Away!~Unknown@69.4.155.83) has joined #mythtv
[05:16:05] VManiac16 (VManiac16!~Unknown@69.4.155.83) has quit (Ping timeout: 248 seconds)
[05:52:57] pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has joined #mythtv
[06:02:11] natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has joined #mythtv
[06:11:38] natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has quit (Read error: Operation timed out)
[06:44:55] 15SAB4E0P (15SAB4E0P!~Goga777@128-71-247-145.broadband.corbina.ru) has joined #mythtv
[06:44:55] 17SABG1OF (17SABG1OF!~Goga777@128-71-247-145.broadband.corbina.ru) has joined #mythtv
[06:44:59] 17SABG1OF (17SABG1OF!~Goga777@128-71-247-145.broadband.corbina.ru) has quit (Read error: Connection reset by peer)
[06:58:44] pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has quit (Quit: Leaving.)
[07:05:49] kormoc (kormoc!~kormoc@mythtv/developer/kormoc) has quit (Quit: kormoc)
[07:06:23] Beirdo: OK, this is apparently related only to playback (for frontend and commflag)
[07:07:13] Beirdo: danielk22: you might want to look at any and all changes to the RingBuffer stuff in the affected time to make sure that there is no possibility of spin-loops under load, etc
[07:07:24] Beirdo: since you likely know that code the best :)
[07:08:26] Beirdo: as for the memory, may or may not be related
[07:08:41] Beirdo: I think we may have two separate buggerings going on here
[07:45:16] stichnot (stichnot!chatzilla@nat/intel/x-vpfvxnnahkpdcgcr) has joined #mythtv
[07:45:16] stichnot (stichnot!chatzilla@mythtv/developer/stichnot) has joined #mythtv
[07:45:16] stichnot (stichnot!chatzilla@nat/intel/x-vpfvxnnahkpdcgcr) has quit (Changing host)
[07:46:24] 15SAB4E0P (15SAB4E0P!~Goga777@128-71-247-145.broadband.corbina.ru) has quit (Remote host closed the connection)
[09:56:07] jya (jya!~jyavenard@mythtv/developer/jya) has quit (Quit: jya)
[10:01:35] map7_ (map7_!~map7@ppp118-209-105-45.lns20.mel4.internode.on.net) has joined #mythtv
[10:20:30] stuartm: here's something I've always wondered about, if you have an if/else if block with a large number of possibilities or even switch block with a lot of possible values, is there any performance benefit to moving the most likely or most frequent value to the top ?
[10:21:54] stuartm: e.g. in MainServer::ProcessRequest() where we spend a lot of time according to oprofile, would it be a good idea to move QUERY_FILETRANSFER right up to the top given that it's called almost constantly when streaming a recording?
[10:22:22] stuartm: rather than performing dozens of QString comparisons before reaching the one we want?
[10:26:03] stuartm: yes, that's mostly a rhetorical question
[10:44:30] peitolm: mostly?
[10:45:17] peitolm: cheap vs expensive, frequent vs rare, it's all a trade-off
[10:51:21] stuartm: right, hence mostly
[10:51:43] stuartm: in this case though that single change dropped backend CPU usage during streaming by at least 10%
[10:52:10] stuartm: and that's why I've pushed it
[10:52:19] stuartm: and will backport it momentarily
[10:53:06] stuartm: it's a bit of a 'well duh' moment, not sure why no-one considered it earlier
[10:53:23] Goga777 (Goga777!~Goga777@128-71-247-145.broadband.corbina.ru) has joined #mythtv
[10:55:48] stuartm: err, not 10%, it dropped at least 10 percentage points, more like 20 for HD material
[10:56:01] stuartm: i.e. high bitrate
[11:13:20] zombor (zombor!~zombor_@108-76-216-155.lightspeed.milwwi.sbcglobal.net) has joined #mythtv
[11:13:21] zombor (zombor!~zombor_@108-76-216-155.lightspeed.milwwi.sbcglobal.net) has quit (Changing host)
[11:13:21] zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv
[11:15:10] kc (kc!~Casper@unaffiliated/kc) has joined #mythtv
[11:16:45] stuartm: fwiw, this doesn't fix the issue of high CPU in the backend generally, just something I noticed when debugging that issue
[11:18:03] dekarl: stuartm: test reordering is a standard optimization IMHO
[11:19:31] dekarl: for the switch block with many values you can do funny optimizations with perfect hashes and jump tables :)
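Sketching dekarl's suggestion against the case stuartm describes: instead of dozens of sequential QString comparisons per request, the command can be resolved with a single hash lookup (or, failing that, the hottest command simply hoisted to the top of the chain). The handler names below are invented and this is not the real MainServer code:

    // Sketch: replacing a long if/else chain of QString comparisons with a
    // one-time QHash lookup. Command names and handlers here are illustrative.
    #include <QHash>
    #include <QString>
    #include <QStringList>

    typedef void (*CommandHandler)(const QStringList &args);

    static void HandleFileTransfer(const QStringList &)   { /* hottest path */ }
    static void HandleQueryRecordings(const QStringList &) { /* ... */ }

    // Built once at startup, before worker threads run, then only read.
    static QHash<QString, CommandHandler> gDispatch;

    void InitDispatchTable()
    {
        gDispatch.insert("QUERY_FILETRANSFER", HandleFileTransfer);
        gDispatch.insert("QUERY_RECORDINGS",   HandleQueryRecordings);
        // ... remaining commands ...
    }

    void ProcessRequest(const QString &command, const QStringList &args)
    {
        CommandHandler handler = gDispatch.value(command);   // one hash lookup
        if (handler)
            handler(args);
        // else: fall through to existing unknown-command handling
    }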
[11:21:25] zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Ping timeout: 248 seconds)
[11:24:36] stuartm: dekarl: aye, like I said, rhetorical :) I tend to ask questions like that when I'm looking at something so blindingly obvious that I feel sure I must be missing something and want someone else to point out my mistake :)
[11:28:05] stoffel (stoffel!~quassel@pD9E41EB5.dip.t-dialin.net) has joined #mythtv
[11:29:49] kc (kc!~Casper@unaffiliated/kc) has quit (Remote host closed the connection)
[11:31:16] kc (kc!~Casper@unaffiliated/kc) has joined #mythtv
[11:37:05] map7_ (map7_!~map7@ppp118-209-105-45.lns20.mel4.internode.on.net) has quit (Read error: Connection reset by peer)
[11:54:54] _Techie_-_AFK_ (_Techie_-_AFK_!~techie@165.33.69.111.dynamic.snap.net.nz) has quit (Ping timeout: 265 seconds)
[11:55:18] _Techie_-_AFK_ (_Techie_-_AFK_!~techie@203.86.207.153) has joined #mythtv
[12:01:44] _Techie_-_AFK_ (_Techie_-_AFK_!~techie@203.86.207.153) has quit (Ping timeout: 250 seconds)
[12:02:49] _Techie_-_AFK_ (_Techie_-_AFK_!~techie@165.33.69.111.dynamic.snap.net.nz) has joined #mythtv
[12:03:56] peitolm: stuartm: the only time you need to be wary is if the tests are done with the assumption that the previous tests have taken out stuff you don't want to see, so you may want to deal with 'bad stuff' before 'not so bad stuff' before dealing with 'possibly bad stuff'
[12:04:00] peitolm: if that makes sense
[12:17:57] danielk22: stuartm: I've often wondered why C/C++ compilers don't use efficient switch implementations. Turbo Pascal switch statements were O(1) rather than the O(n) for C/C++ ones.
[12:20:37] stuartm: less of an issue with switch restricted to ints, than it might be with these large if/else if doing string comparisons
[12:21:11] danielk22: stuartm: QString comparisons shouldn't be so bad, most should fail on the length comparisons.
[12:21:41] danielk22: stuartm: Then presumably they use Boyer-Moore
[12:21:42] stuartm: danielk22: bad enough if doing dozens in a tight loop
[12:22:37] stuartm: made enough of a difference re-ordering the checks to be worth doing, but there probably aren't many other places in the code where that would be true
[12:23:06] danielk22: stuartm: But each one of those corresponds to doing a lot of work. Without the oprofile I would have assumed the work would dominate the time spent.
[12:23:35] Jordack (Jordack!~jordack@h69-131-44-221.plmomi.dedicated.static.tds.net) has joined #mythtv
[12:25:02] stuartm: MainServer::ProcessRequestWork() and MainServer::HandleFileTransferQuery() were 2nd and 3rd in the oprofile output before that change, that hasn't really altered much but the %s are down
[12:26:11] danielk22: Are you profiling the kernel too? Maybe the work is hidden there?
[12:26:45] stuartm: just mythbackend atm
[12:27:11] danielk22: My guess is that if that is showing up in the profile, very little time is actually spent in mythbackend. Most is spent in the kernel.
[12:27:21] stuartm: this is what it looks like now, scheduler is running so that messes it up a little – http://pastebin.com/wkAHucUB
[12:27:38] stuartm: danielk22: oh it is
[12:28:27] danielk22: stuartm: Are you seeing the high cpu usage?
[12:29:34] stuartm: yes
[12:30:26] stuartm: backend is currently using 64% cpu, restarted it less than 30 minutes ago
[12:30:29] danielk22: Beirdo: Those extra 8MB anon blocks went away overnight..
[12:30:46] danielk22: stuartm: Wow!
[12:30:47] stuartm: climbs steadily, and faster than the leak
[12:30:57] stuartm: although only when watching a recording
[12:33:02] stuartm: if I restart it and just leave it alone it won't rise above 5–6%, which is how it stayed overnight, the moment I start watching a recording both memory and CPU start rising
[12:33:18] danielk22: Have you tried "top -H" to isolate the high CPU threads and then attaching to those?
[12:33:57] stuartm: I'm not sure the CPU usage is a direct result of memory fragmentation as the CPU seems to get pretty high even when memory usage is still small e.g. 17% for just ~60MB Resident
[12:34:41] stuartm: danielk22: hah, I did look at top -H but I didn't think to attach to just those threads
[12:35:11] stuartm: the usage was split over two threads in particular
[12:37:18] stuartm: hmm, the event loop?
[12:38:24] stuartm: no, it's a pool thread so that doesn't make sense
[12:40:00] danielk22: Can you paste the BTs?
[12:54:10] cattelan_away is now known as cattelan
[13:19:25] stuartm: 2012-06-25 14:19:05.691446 E PSIP table 0x82 is invalid
[13:19:26] stuartm: 2012-06-25 14:19:05.720705 E PSIP packet failed CRC check. pid(0xc0) type(0x82)
[13:19:28] stuartm: oops
[13:20:55] stuartm: btw, if you want to hang the frontend, just kill the backend while it's playing back a recording – pretty sure that's a regression
[13:21:27] zombor (zombor!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has joined #mythtv
[13:21:27] zombor (zombor!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has quit (Changing host)
[13:21:27] zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv
[13:21:54] danielk22: stuartm: That doesn't work here..
[13:22:21] stuartm: gets completely stuck here, but that's another issue for another time :)
[13:23:12] danielk22: Heh, yeah :) Can you get me backtraces on those two busy threads? I might be able to figure out what is going on from that..
[13:24:24] danielk22: BTW Most of the initial virt allocation appears to be thread local storage. It has jumped because we now have an extra 10 zeromq threads.
[13:26:12] danielk22: I think the default has also gone up. I'm pretty sure it wasn't traditionally 64MB.
[13:26:40] danielk22: But that memory is hardly touched so it only accounts for a page or two of resident memory.
[13:27:41] danielk22: I can't figure out if it is ld or Qt that's requesting the 64MB.
[13:28:03] danielk22: (64 MB per thread in addition to 8MB for the stack.)
[13:28:50] stoffel (stoffel!~quassel@pD9E41EB5.dip.t-dialin.net) has quit (Read error: Connection reset by peer)
[13:29:42] danielk22: Ok, QObject calls __tls_get_addr..
[13:30:40] stoffel (stoffel!~quassel@pD9E41EB5.dip.t-dialin.net) has joined #mythtv
[13:30:46] stuartm: danielk22: not much to show on those, every time I grab the bts they look like threads 33 and 39 in this full BT – http://pastebin.ca/2164505
[13:31:03] stuartm: if I try to step through them or continue the backend just dies
[13:32:04] danielk22: Hmm, yeah.. knowing that they are pool threads doesn't tell us much. Just that it is some QRunnable using up CPU.
[13:32:11] danielk22: That is something though.
[13:32:40] danielk22: Should be possible to have MPoolThread print out the name of each runnable it runs...
[13:36:09] Goga777 (Goga777!~Goga777@128-71-247-145.broadband.corbina.ru) has quit (Read error: Connection reset by peer)
[13:38:44] rsiebert_ (rsiebert_!~quassel@e179130056.adsl.alicedsl.de) has joined #mythtv
[13:42:05] rsiebert (rsiebert!~quassel@e179130208.adsl.alicedsl.de) has quit (Ping timeout: 260 seconds)
[13:44:01] stuartm: --loglevel debug does that afaik
[13:48:46] stuartm: ah, plus VERBOSE_THREADS
[13:54:50] stuartm: but if I try to enable it the backend deadlocks on startup
[13:55:31] stuartm: LOG() called in the LoggerThread constructor no es bueno
[13:55:40] stuartm: Beirdo: ^^
[14:14:47] danielk22: stuartm: Can you try reverting [f2a0d748] ?
[14:18:28] pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has joined #mythtv
[14:20:41] gregL (gregL!~greg@cpe-74-76-125-87.nycap.res.rr.com) has quit (Quit: Leaving)
[14:22:21] pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has quit (Client Quit)
[14:24:48] stuartm: danielk22: you're thinking something to do with ProcessRequestRunnable?
[14:25:14] danielk22: Yeah, leaking sockets
[14:27:29] stichnot: danielk22: [f84028861c163520c9da11d5787126ee0cfb497e] is a "git bisect bad" commit that I found, and I believe it predates your commit, assuming I'm seeing the gitk graph correctly
[14:28:26] danielk22: stichnot: This is based on initial memory consumption or CPU use?
[14:28:51] stichnot: this is the growth over time issue, which (in my case) correlates with CPU use
[14:29:39] stuartm: danielk22: doesn't look good, cpu/memory are still climbing, but I'll leave it playing for bit longer just to be sure
[14:30:06] stichnot: btw, when doing the git bisect, I did notice that the initial VIRT size jumped after the zeromq merge, which you guys diagnosed as more thread stacks
[14:30:27] danielk22: stichnot: Yep, 10 more threads * (8MB+64MB)
[14:30:29] amessina (amessina!~amessina@2001:470:1f11:a4:d6be:d9ff:fe8d:7c1e) has joined #mythtv
[14:31:07] stuartm: ok, that confirms it, reverting that commit did not help
[14:31:11] danielk22: Maybe we can decrease the size of the zeromq thread pool, but it does look like it is a red herring.
[14:31:42] danielk22: stuartm: Thanks, just wanted to be sure. It's the only thing I could think of in the reference counting that could cause a problem on the backend.
[14:32:13] stuartm: the issue occurs much more quickly for HD recordings, so the file transfer/ringbuffer look most likely
[14:32:58] danielk22: stuartm: But there haven't been any changes there recently AFAIK.
[14:33:21] stuartm: danielk22: right, I can't see anything in the git logs :/
[14:33:38] stuartm: indirectly related maybe
[14:34:01] danielk22: Yeah, could be some interaction between it and something else.
[14:34:11] stuartm: danielk22: when you disabled logging you just modified the LOG macro?
[14:34:32] danielk22: I just modified the LOG macro. Not quite sure how to disable the logging altogether.
[14:37:29] danielk22: Besides I can't seem to reproduce the backend issues..
[14:38:12] stuartm: yeah I can't really see how that would translate into the issues I'm seeing if nothing gets past VERBOSE_LEVEL_CHECK()
[14:40:34] stuartm: I'll fix thread registration logging later and see whether that sheds any light
[14:42:45] stuartm: danielk22, stichnot: what version of QT?
[14:43:10] danielk22: 4.8.1
[14:43:32] stuartm: https://github.com/MythTV/mythtv/commit/f8402886
[14:44:27] danielk22: stuartm: That can't be the commit, it doesn't even get used on Linux, it must be one before or after.
[14:45:35] danielk22: last good [79dc4b50], earliest bad [f8402886].
[14:45:47] danielk22: stichnot: wasn't able to test in between
[14:46:01] stichnot: Qt 4.6.2
[14:46:55] stichnot: just a reminder — when I ran vtune on the backend a few days ago, most of the CPU was in malloc/free related code, but apart from that, the top method was something about Qt events. I would look there first.
[14:48:43] stuartm: danielk22: aye, I was just using my autocomplete to get a url :)
[14:49:04] stuartm: since mythlogbot isn't providing those any more
[14:50:17] stuartm: I've got some work to do, I'll look at it again tonight
[14:53:37] danielk22: as far as I can tell from the github browser the only thing there that could cause an issue is the zeromq logging merge.
[14:55:55] stichnot: Has anyone else reproduced my last good / earliest bad finding? It would be good to have confirmation.
[14:56:21] danielk22: stuartm: ^^ Can you, since I'm not seeing the problem I can't reproduce...
[15:04:10] stichnot: and stuartm: what version of Qt are you using?
[15:06:10] Beirdo: stuartm: what's wrong with using LOG() in LoggerThread ctor?
[15:09:11] Beirdo: should work just fine, all it will do is put the item on the queue that the LoggerThread will later be reading
[15:10:14] taylorr (taylorr!~taylorr@unaffiliated/elmojo) has quit (Ping timeout: 272 seconds)
[15:10:26] Beirdo: anyways, I gotta go :) new job starts today, and the bus is in about 15min
[15:13:31] taylorr (taylorr!~taylorr@cpe-173-095-145-027.nc.res.rr.com) has joined #mythtv
[15:13:31] taylorr (taylorr!~taylorr@unaffiliated/elmojo) has joined #mythtv
[15:13:31] taylorr (taylorr!~taylorr@cpe-173-095-145-027.nc.res.rr.com) has quit (Changing host)
[15:16:09] Goga777 (Goga777!~Goga777@128-71-247-145.broadband.corbina.ru) has joined #mythtv
[15:20:10] Goga777 (Goga777!~Goga777@128-71-247-145.broadband.corbina.ru) has quit (Read error: Connection reset by peer)
[15:24:37] stichnot (stichnot!chatzilla@mythtv/developer/stichnot) has quit (Ping timeout: 252 seconds)
[15:29:25] danielk22: Beirdo: stichnot: stuartm: PollingZMQSocket::poll() could be vulnerable to a spin. We only handle EVT_POLLIN revents, but if we're using POSIX poll on some layer of this stack we could get a POLLERR, POLLHUP, or POLLNVAL.
[15:30:19] danielk22: This could be protected against in zmq. But I don't know what zmq::poll() actually refers to.
[15:32:51] danielk22: Well POLLHUP and POLLNVAL can't happen since this is a read.
[15:37:16] danielk22: Ok, shouldn't be an issue. It looks like this maps to zmq_poll, which only returns POLLERR if it is a regular socket, not if it is a zmq socket.
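For reference, the spin danielk22 is checking for is the classic one where a poll loop waits on POLLIN and ignores error revents: POLLERR, POLLHUP or POLLNVAL make poll() return immediately, nothing clears the condition, and the loop pegs a CPU. A generic POSIX sketch of the defensive shape, not the actual PollingZMQSocket code:

    // Generic guard against a poll() spin: error revents must be handled,
    // otherwise poll() keeps returning immediately and the loop burns CPU.
    #include <poll.h>
    #include <unistd.h>

    bool ReadLoop(int fd)
    {
        struct pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLIN;

        for (;;)
        {
            pfd.revents = 0;
            int rc = poll(&pfd, 1, 1000 /* ms */);
            if (rc < 0)
                return false;                     // poll itself failed
            if (rc == 0)
                continue;                         // timeout, nothing to do

            if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
                return false;                     // bail out instead of spinning

            if (pfd.revents & POLLIN)
            {
                char buf[4096];
                ssize_t n = read(fd, buf, sizeof(buf));
                if (n <= 0)
                    return false;                 // EOF or error
                // ... hand buf/n to the consumer ...
            }
        }
    }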
[15:41:47] stuartm: Beirdo: it causes a deadlock
[15:42:58] stuartm: Beirdo: "export VERBOSE_THREADS=1" and then try to run an app (frontend/backend/etc) and it instantly deadlocks
[15:52:16] Seeker`: stuartm: can't read backlog as I'm on an iPhone so don't know if this has been discussed, but might something like QMetaEnum be more useful for the command
[15:52:19] Seeker`: decoding
[15:55:15] danielk22: stuartm: You can just comment out that LOG in the LoggerThread ctor
[16:10:56] pheld (pheld!~heldal@cl-5.osl-01.no.sixxs.net) has joined #mythtv
[16:23:13] dekarl: stuartm: see #10495, wrt 2012-06-25 14:19:05.720705 E PSIP packet failed CRC check. pid(0xc0) type(0x82)
[16:23:13] ** MythLogBot http://code.mythtv.org/trac/ticket/10495 **
[16:33:13] joki (joki!~joki@p548625C5.dip.t-dialin.net) has quit (Ping timeout: 245 seconds)
[16:33:46] joki (joki!~joki@p5486599C.dip.t-dialin.net) has joined #mythtv
[16:45:12] Mousey (Mousey!~r0dent_@ross154.net) has joined #mythtv
[16:48:25] jmusits (jmusits!~jmusits@66-109-48-242.tvc-ip.com) has joined #mythtv
[16:51:15] stichnot (stichnot!~chatzilla@192.55.54.40) has joined #mythtv
[16:51:15] stichnot (stichnot!~chatzilla@mythtv/developer/stichnot) has joined #mythtv
[16:51:15] stichnot (stichnot!~chatzilla@192.55.54.40) has quit (Changing host)
[16:59:54] jmusits is now known as lctdt
[17:00:10] lctdt is now known as jmusits
[17:01:17] NightMonkey (NightMonkey!~NightrMon@pdpc/supporter/professional/nightmonkey) has joined #mythtv
[17:01:46] jmusits: hey xris: do you have any plans for MythWeb on PHP 5.4?
[17:02:34] jmusits: there were a few obvious fatals that I have fixed, but haven't extensively audited yet
[17:06:20] zombor_ (zombor_!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has joined #mythtv
[17:06:20] zombor_ (zombor_!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has quit (Changing host)
[17:06:20] zombor_ (zombor_!~zombor_@kohana/developer/zombor) has joined #mythtv
[17:06:46] zombor_ (zombor_!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection)
[17:07:10] zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Read error: Connection reset by peer)
[17:07:15] zombor_ (zombor_!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has joined #mythtv
[17:07:15] zombor_ (zombor_!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has quit (Changing host)
[17:07:15] zombor_ (zombor_!~zombor_@kohana/developer/zombor) has joined #mythtv
[17:13:59] SteveGoodey (SteveGoodey!~steve@host86-160-43-123.range86-160.btcentralplus.com) has joined #mythtv
[17:25:26] stoffel (stoffel!~quassel@pD9E41EB5.dip.t-dialin.net) has quit (Ping timeout: 246 seconds)
[17:31:09] xris: jmusits: didn't I commit the fixes from the bug in trac?
[17:31:55] jmusits: xris: I didn't notice them as of 2–3 weeks ago
[17:32:04] jmusits: didn't look through trace yet
[17:32:09] jmusits: *trac
[17:32:18] xris: yeah, it was after that.. less than 2 weeks ago
[17:32:32] jmusits: ok cool
[17:32:34] xris: #10504
[17:32:34] ** MythLogBot http://code.mythtv.org/trac/ticket/10504 **
[17:34:23] kormoc (kormoc!~kormoc@mythtv/developer/kormoc) has joined #mythtv
[17:39:44] xris: It was right after I upgraded to fedora 17...
[17:40:25] jmusits: gotcha, I noticed when I migrated to a new box and upgraded to 0.25
[17:51:06] xris: ah. yeah, I didn't backport the fixes to .25
[17:52:12] jmusits: I'm running off of the .25 fixes branch, and manually patched it
[17:52:33] stichnot: danielk22, stuartm, Beirdo: I found a much more effective way of demonstrating the mythbackend memory growth (at least on my system) — run "mythcommflag --rebuild" on a remote machine. (If you run it on the backend machine, I believe it accesses the file directly, not through backend streaming.) Each run increases VIRT and RES size by about 100MB.
[17:54:00] stichnot: Here are some VIRT/RES stats (in MB) after successive mythcommflag runs. 913/479 > 1045/590 > 1150/696 > 1258/805 > 1371/907 > 1489/1.0g
[17:55:37] xris: jmusits: you have commit access? if not, feel free to send me a patch and I'll apply it.
[17:56:23] jmusits: I do not have commit access, but if you already committed to master then I don't think you would need a patch from me
[17:57:27] xris: ah, I suppose the patches *would* apply cleanly. it's not like kormoc or I have done anything with mythweb in ages.
[17:58:26] jmusits: in both the MythTV PHP bindings and in Mythweb
[17:58:45] jmusits: weird, that message got chopped
[17:59:17] jmusits: I was just saying that the only things I patched were the pass by reference fatals
[17:59:27] jmusits: which it looks like is what you patched
[17:59:57] stichnot: danielk22, stuartm, Beirdo: I say that it's much more effective because I can see the profound growth in just a few seconds, instead of having to wait many minutes.
[18:04:54] jmusits: actually, I misspoke – I had one additional change to one of the wap templates
[18:05:33] jmusits: I can send you that patch if you want
[18:10:08] stoffel (stoffel!~quassel@pD9E41EB5.dip.t-dialin.net) has joined #mythtv
[18:15:01] amessina (amessina!~amessina@2001:470:1f11:a4:d6be:d9ff:fe8d:7c1e) has quit (Quit: Konversation terminated!)
[18:19:15] xris: wow, we still have WAP templates… heh
[18:19:40] xris: yeah, feel free to send a patch/ticket.
[18:22:36] jmusits: http://code.mythtv.org/trac/ticket/10856
[18:30:34] zombor (zombor!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has joined #mythtv
[18:30:34] zombor (zombor!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has quit (Changing host)
[18:30:34] zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv
[18:30:50] zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Remote host closed the connection)
[18:31:07] zombor_ (zombor_!~zombor_@kohana/developer/zombor) has quit (Read error: Connection reset by peer)
[18:31:19] zombor (zombor!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has joined #mythtv
[18:31:20] zombor (zombor!~zombor_@50-73-122-41-ip-static.hfc.comcastbusiness.net) has quit (Changing host)
[18:31:20] zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv
[18:32:50] natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has joined #mythtv
[18:41:28] stuartm: stichnot, danielk22: I've ruled out the Playback Sock and FileTransfer reference counter commits, I think that might be it for reference counter changes
[18:48:22] stichnot: I did a "git diff 79dc4b50 f8402886" and filtered out the stuff in the external directory, leaving 4800 lines of diffs. I'll look through that for clues.
[18:49:35] stuartm: I'm fishing in commits for the last 30 days in specific libs and the backend itself
[18:49:59] stuartm: unfortunately due to the way git merges commits that means missing a fair number
[18:51:06] stuartm: i.e. a merge can include a year old commit, so even if the problem is recent the commit which caused it appears a year back in the logs :(
[19:15:33] danielk22: stichnot: Interestingly that grows RES here, but not so much VIRT.
[19:19:17] cesman (cesman!~cesman@pdpc/supporter/professional/cesman) has joined #mythtv
[19:30:28] danielk22: Through many runs I've gotten RES from <600m up to 1G, but virt has just moved from 2180m to 2386m
[19:31:07] stuartm: 600m sounds extremely high to begin with?
[19:32:54] danielk22: I dunno, from a virt of 2G+ ? The machine I'm testing on has no swap enabled.
[19:34:59] stichnot: I don't know much about Qt event processing, so this may be all off, but... Is there any way the logging thread could be endlessly queuing up events/messages meant for other threads, and not acting on them or removing them from its queue? On a clean shutdown, those would presumably get cleaned up and valgrind would be happy.
[19:37:15] danielk22: stichnot: If it sent queued signals to a QObject owned by a thread not running an event loop that is what would happen.
[19:37:21] stuartm: danielk22: well that virt seems high too
[19:37:47] danielk22: But then these wouldn't be getting logged to whatever logging facility they were supposed to be logged to.
[19:38:11] stuartm: this machine has been swapping but on a 0.24 machine without swap the backend is using just ~830m virt, 107m res
[19:38:27] danielk22: stuartm: That virt is pretty close to the starting virt.
[19:38:55] stuartm: on this machine, even with some growth I'm only experiencing 1646m 78m
[19:39:08] danielk22: stuartm: zeromq adds 10 threads, or 720MB to the initial virt.
[19:39:17] stuartm: wow
[19:40:21] stichnot: danielk22: I'm thinking of the logging thread receiving copies of the ringbuffer request messages or responses
[19:40:39] danielk22: The virt is up to 2522m and the res is up to 1.1g, so both are growing, just not at a rate of 100MB.
[19:43:43] stuartm: so ... do we really need logging? :p
[19:43:44] stichnot: I don't think this has anything to do with log-related events. After one remote mythcommflag run, process size increases by 100MB but a paltry 3KB of logs are written out.
[19:44:52] danielk22: stichnot: I can't imagine how the logging thread would receive ringbuffer request messages or responses.
[19:46:45] stichnot: If some QObject were set up as an event listener but lacked an event processing loop, could it get copies of such events, which would get queued endlessly?
[19:46:55] stichnot: like I say, I know very little of this architecture
[19:48:01] danielk22: You need to explicitly register to see MythEvent events.. but...
[19:48:22] Seeker`: Seen this discussion going on for a few days; have you found the cause yet?
[19:48:28] danielk22: LogForwardThread::run() runs qApp->processEvents() but doesn't actually call exec()
[19:48:52] danielk22: processEvents processes most events, but it doesn't process QObject::deleteLater()...
[19:49:48] fully_human (fully_human!~fully_hum@50-39-129-183.bvtn.or.frontiernet.net) has joined #mythtv
[19:50:08] stichnot: same with LoggerThread, right?
[19:51:14] fully_human: Hey, all. I was wondering if anyone could recommend a TV tuner with a PCI-e connection (preferably around $50 or less). I've been searching for ages online. I have Linux Mint 12. Thanks. :-)
[19:51:16] danielk22: LoggerThread doesn't call processEvents().
[19:51:49] Seeker`: fully_human: better to ask in #mythtv-users
[19:51:57] fully_human: Ah, sorry.
[19:52:15] fully_human (fully_human!~fully_hum@50-39-129-183.bvtn.or.frontiernet.net) has left #mythtv ("Leaving")
[19:52:19] stichnot: according to logging.cpp:361 it does (unless I'm looking at the wrong version or something)
[19:53:02] danielk22: Nah, I was looking at the wrong thing.
[19:53:52] stuartm: Seeker`: not yet no, and it doesn't feel like we're any closer although I suppose we have ruled out some things
[19:54:10] Seeker`: stuartm: narrowed it down to a commit range?
[19:54:47] stuartm: Seeker`: supposedly, although it's not so easy to do that with git
[19:55:29] Seeker`: stuartm: is this symptomatic: 11589 mserv 20 0 3878m 1.1g 10m S 17 29.1 253:49.50 mythbackend
[19:55:34] Seeker`: (Im guessing yes)
[19:55:39] danielk22: Beirdo: FYI DeferredDelete events are not processed in QCoreApplication::processEvents(QEventLoop::AllEvents);
[19:56:14] Seeker`: if so, thats v0.26-pre-641-g8438a0e-dirty, just as a data point
[19:56:22] stuartm: Seeker`: yeah, anything higher than ~150m res I'd call abnormal
[19:57:25] stuartm: what's 'normal' is probably lower still, but it's hard to be definitive given the variables
[19:57:53] danielk22: stichnot: It's not really safe, but can you try adding a QCoreApplication::processEvents(QEventLoop::DeferredDeletion); right after each processEvents in those threads? If that works we can just rework those two threads to the Qt event loop.
[19:58:19] stichnot: sure, I'll try that.
[19:58:27] danielk22: s/to the Qt event loop/to use the Qt event loop/
[19:58:51] stuartm: that 0.24 box of mine has an uptime of a few weeks and still only uses 108m res ... so things aren't looking so great by comparison in master
[19:59:02] Seeker`: stuartm: ah, yes.
[19:59:40] Seeker`: not sure if my latest build is within the narrowed commit range or not
[20:00:05] danielk22: stichnot: That may need to be "QApplication::sendPostedEvents(0, QEvent::DeferredDelete);"
[20:01:24] stichnot: danielk22: you're saying if the first change has no effect, try the second change instead?
[20:01:28] danielk22: Yeah.
[20:03:22] Seeker`: has anyone tried overloading new / calloc to catch leaks?
[20:03:46] danielk22: Even that may not work, deferred deletes firing at the wrong time was apparently a huge problem with Qt3; so they are only supposed to happen when you're in the "non-recursive event loop", which you can only enter via exec().
[20:03:59] stichnot: same behavior from the first one. Trying the second...
[20:04:08] xris: wagnerrp: https://github.com/MythTV/mythtv/commit/08b88 . . . d8c937ab95f2
[20:05:22] wagnerrp: seems fine to me
[20:08:06] danielk22: If the sendPostedEvents() doesn't work this is probably not the issue.
[20:10:36] stichnot: same behavior from the second one. I only changed the one instance in logging.cpp and the two instances in loggingserver.cpp, but there are more instances of processEvents that I didn't modify
[20:17:05] danielk22: stichnot: Then this likely isn't the problem. There probably are no deferred deletes in those loops.
[20:18:07] SteveGoodey (SteveGoodey!~steve@host86-160-43-123.range86-160.btcentralplus.com) has quit (Remote host closed the connection)
[20:22:52] danielk22: stichnot: I was able to grow my backend to 2762m VIRT and 1.3g RES.
[20:23:47] stichnot: using the mythcommflag trick?
[20:23:53] danielk22: yep
[20:23:59] stichnot: good :)
[20:24:33] danielk22: When I restarted mythbackend was down to 1628m VIRT, 33m RES.
[20:24:42] Beirdo: stichnot: what did ya change?
[20:25:23] Beirdo: sonva...
[20:25:34] Beirdo: OK, yeah, let's add the deferred delete execs :)
[20:26:52] stichnot: Beirdo: in the 3 instances of "qApp->processEvents(QEventLoop::AllEvents, 10);" in logging.cpp and loggingserver.cpp, I added "qApp->processEvents(QEventLoop::DeferredDeletion);". When that did nothing, I changed it to "qApp->sendPostedEvents(0, QEvent::DeferredDelete);". That also did nothing.
[20:27:01] danielk22: Beirdo: We should handle deferred deletes of course, but it doesn't look like that is the issue.
[20:27:31] danielk22: QEventLoop::DeferredDeletion is gone in the latest Qt, but the sendPostedEvents should still work.
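To spell out the deleteLater() point: DeferredDelete events are only delivered from a running event loop entered via exec(), or when they are flushed explicitly, so a hand-rolled loop that only calls processEvents() lets objects queued with deleteLater() accumulate. A sketch of the change stichnot tried, shaped like the logging threads' loops but not their actual code:

    // Hand-rolled thread loop that also flushes deleteLater() requests.
    // Without the sendPostedEvents() call, DeferredDelete events queued from
    // this loop are never delivered until exec() runs (or the app exits).
    #include <QCoreApplication>
    #include <QEventLoop>
    #include <QEvent>

    void RunThreadLoop(volatile bool &aborted)
    {
        while (!aborted)
        {
            // deliver queued signals/events for objects owned by this thread
            qApp->processEvents(QEventLoop::AllEvents, 10);

            // explicitly deliver pending DeferredDelete (deleteLater) events;
            // QEventLoop::DeferredDeletion is deprecated, so use sendPostedEvents
            QCoreApplication::sendPostedEvents(0, QEvent::DeferredDelete);

            // ... the thread's real work: wait on a condition, drain a queue, etc ...
        }
    }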
[20:28:33] Beirdo: Hmm
[20:28:50] Beirdo: OK. good to know
[20:29:02] Beirdo: I'm sure there are other threads that might need the same treatment
[20:29:44] Beirdo: ones that actually DO use deferred deletes :)
[20:30:47] danielk22: Beirdo: Ya gotta be careful with deferred deletes, that's why they made it so hard to do them. Unless it's a long lived thread it's best to let sleeping dogs lie.
[20:32:19] Beirdo: yeah, I see. The logging threads are all long-living though. I didn't remember if I had deferred deletes in the end or not.
[20:35:19] stichnot: Seeker`: I believe valgrind has been used, which is essentially what you suggest. So far there is no smoking gun, suggesting that the "leak" may be a controlled growth that gets cleaned up at shutdown.
[20:36:47] danielk22: stichnot: I tried valgrind and mtrace on the frontend, not the backend...
[20:36:58] zombor (zombor!~zombor_@kohana/developer/zombor) has quit (Ping timeout: 245 seconds)
[20:37:22] Seeker`: stichnot: I mean something which counts where memory is allocated (linked list showing total, __FILE__ and __LINE__) which you can then print out whenever you want
[20:37:53] stichnot: danielk22: #10846
[20:37:53] ** MythLogBot http://code.mythtv.org/trac/ticket/10846 **
[20:38:29] mrand (mrand!~mrand@ubuntu/member/mrand) has quit (Ping timeout: 252 seconds)
[20:38:42] Beirdo: I tried running with valgrind and --show-reachable=yes, but it toook FOOOREVER
[20:39:03] Beirdo: like couldn't even do the SocketThread stuff within half an hour
[20:39:12] Beirdo: It might have been blocked on something
[20:39:24] danielk22: stichnot: ic. still may be worth running with the mythcommflag trick. Should be able to amplify the signal vs the noise.
[20:39:27] stichnot: also, how do you get the backend to exit cleanly? or is that taken care of by the signal handlers?
[20:39:41] Seeker`: stichnot: that way you can just look for lines of code that have allocated more than, say, 1g of memory
[20:39:51] danielk22: stichnot: kill will do a controlled exit most of the time.
[20:40:01] danielk22: (just do one though.)
[20:40:37] Beirdo: I'll have to get a lowdown on what all you guys found later tonight, gotta get back to work :)
[20:41:14] Seeker`: o/
[20:41:54] Beirdo: Seeker`: there's nothing to do that.
[20:42:06] Beirdo: and it's not something asking for > 1G at a time either
[20:43:00] stichnot: also, I will bet that the growth is internal Qt structures, which would take a deeper stack trace to make any sense of
[20:43:04] danielk22: Seeker`: if it were big mallocs it would show up in pmap since big mallocs are mapped to mmaps.
[20:43:18] Beirdo: danielk22: I think I'll change the LoggingItem to use your new reference counter with auto-delete
[20:43:36] Beirdo: that is the one place LoggerThread does use deleteLater
[20:43:55] Seeker`: danielk22: what about multiple small allocs though
[20:44:10] Beirdo: still didn't get finished cleaning up the logserver startup
[20:44:17] danielk22: Seeker`: just too many millions of them to step through.
[20:44:26] Beirdo: got distracted by my ballooning backend process
[20:44:35] stichnot: Beirdo: there is big growth even when there is little or no logging, so it's probably not a LoggingItem issue
[20:44:58] Beirdo: stichnot: fair enough, but it can't hurt to clean it up too :)
[20:45:03] stichnot: indeed.
[20:46:01] Seeker`: danielk22: hence why I was asking if anyone had tried overloading new etc, as you can automate a lot of it
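A rough illustration of what Seeker` is suggesting: globally replacing operator new/delete with versions that keep a running byte count (attributing allocations to __FILE__/__LINE__ needs an extra macro layer on top of this). A diagnostic toy only, with the caveats noted in the comments:

    // Toy allocation counter: operator new over-allocates a 16-byte header to
    // remember each block's size so operator delete can subtract it again.
    // The counter is a plain long long, i.e. not thread-safe; a real version
    // would use an atomic and probably a per-call-site table.
    #include <cstdlib>
    #include <cstdio>
    #include <new>

    static long long gLiveBytes = 0;

    static void *CountedAlloc(std::size_t size)
    {
        unsigned char *raw = static_cast<unsigned char *>(std::malloc(size + 16));
        if (!raw)
            throw std::bad_alloc();
        *reinterpret_cast<std::size_t *>(raw) = size;   // stash size in the header
        gLiveBytes += static_cast<long long>(size);
        return raw + 16;
    }

    static void CountedFree(void *ptr)
    {
        if (!ptr)
            return;
        unsigned char *raw = static_cast<unsigned char *>(ptr) - 16;
        gLiveBytes -= static_cast<long long>(*reinterpret_cast<std::size_t *>(raw));
        std::free(raw);
    }

    void *operator new(std::size_t size) throw(std::bad_alloc)   { return CountedAlloc(size); }
    void *operator new[](std::size_t size) throw(std::bad_alloc) { return CountedAlloc(size); }
    void operator delete(void *ptr) throw()                      { CountedFree(ptr); }
    void operator delete[](void *ptr) throw()                    { CountedFree(ptr); }

    // Call periodically (or from a debug command) to see how much memory
    // allocated through new/new[] is currently live.
    void DumpLiveBytes()
    {
        std::fprintf(stderr, "live bytes via new/new[]: %lld\n", gLiveBytes);
    }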
[20:47:02] Beirdo: anyways, be back later.  :)
[20:52:18] stoffel (stoffel!~quassel@pD9E41EB5.dip.t-dialin.net) has quit (Remote host closed the connection)
[20:54:06] stichnot: I tried suppressing logging threads by removing the logStart() call from MythCommandLineParser::ConfigureLogging(), but that had little or no effect (though I don't know if removing that call is sufficient).
[20:54:08] cattelan (cattelan!~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has quit (Ping timeout: 240 seconds)
[20:54:31] mrand (mrand!~mrand@ubuntu/member/mrand) has joined #mythtv
[20:56:25] Beirdo: it's not
[20:56:39] Beirdo: you'd need to stub out the LOG() macro as well
[20:57:12] Beirdo: not running logStart would just fill the queue and use up memory :)
[20:58:05] Beirdo: do that and change LOG() in mythlogging.h to #define LOG(whatever) (void)(0)
[20:58:08] Beirdo: or the like
[20:58:18] Beirdo: then make clean, make
[20:58:20] Beirdo: :)
[20:58:32] Beirdo: then it will never create the LoggingItem to start with
[21:00:40] stichnot: actually I'm trying to keep the logging threads from starting, along with their event processing/listening/etc., under the theory that they were responsible for queues filling up with non-log related events. which may be total BS since I don't really know how the event system works.
[21:01:11] Beirdo: well, yeah, you'd want to disable them too if you want to rule out logging being an issue
[21:01:35] Beirdo: just a note: nzmqt uses QTimer internally :(
[21:01:47] Beirdo: I noticed that last night
[21:02:30] Beirdo: if you stub out LOG() and never call logStart(), you should be OK, I think (well and not call logStop() on shutdown)
[21:02:50] Beirdo: would certainly quiet it down significantly
[21:03:19] Beirdo: if it still misbehaves without that, we can be pretty sure the issue is mainly elsewhere
[21:03:44] amejia (amejia!~andres@129.174.97.130) has joined #mythtv
[21:03:44] amejia (amejia!~andres@xbmc/staff/amejia) has joined #mythtv
[21:03:44] amejia (amejia!~andres@129.174.97.130) has quit (Changing host)
[21:05:27] danielk22: Beirdo: My fix for QTimer is now in Qt5 master.. so eventually it will be safe to use...
[21:06:37] cattelan (cattelan!~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #mythtv
[21:14:34] xris: interesting writeup for those of you guys doing UI stuff: http://blog.mecheye.net/2012/06/the-linux-graphics-stack/
[21:36:59] jmusits (jmusits!~jmusits@66-109-48-242.tvc-ip.com) has quit (Quit: leaving)
[21:41:45] danielk22: stichnot: I think you are probably right about the memory being under Qt control somehow.
[21:43:16] Beirdo: I'm still of the mindset that it might be related to deleteLater stuff somewhere maybe, but...
[21:43:39] ** Beirdo is waiting for a IMG->VMDK conversion to finish **
[21:44:23] Seeker`: IMG? VMDK?
[21:44:44] Beirdo: yeah, converting a vmware image around. anyways.. :)
[21:47:18] stuartm: Beirdo: in LoggerThread::handleItem() ln 536, is that not going to leak if m_threadName is already set? Which should be the case since we've called fillItem() at that point
[21:48:00] Beirdo: one sec
[21:49:57] danielk22: stuartm: looks like it is free'd on the previous line of code..
[21:50:50] danielk22: nm, looking at wrong line
[21:51:03] Beirdo: line 536?
[21:51:31] Beirdo: that won't be set, I don't think
[21:51:35] stuartm: item->m_threadName = logThreadHash[item->m_threadId];
[21:52:29] stuartm: assumes that item->m_threadName will be NULL, although FillItem() will be called before that point which sets m_threadName
[21:52:41] Beirdo: hmm, yeah, possibly that could cause a wee leak
[21:52:46] Beirdo: :)
[21:52:54] Beirdo: only on deregistration, but good catch
[21:53:10] Beirdo: I'd reworked some of that, looks like that one got missed
[21:53:27] stuartm: yep, tiny leak but it caught my eye
[21:53:59] Beirdo: I'd fixed the one in registration above, but not that one
[21:54:30] Beirdo: not to mention those two lines can get changed into a logThreadHash.take() like above
[21:54:59] Beirdo: although that'd still be a leak
[21:55:06] Beirdo: I need to fix that :)
[21:55:16] Beirdo: should do a take to get it out of the hash, and then free it
[21:55:34] Beirdo: as it gets strduped going in
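For reference, the leak stuartm spotted is the classic overwrite-without-free pattern: the hash owns strdup()'d thread names, so deregistration has to pull its entry out and release it instead of copying it over a name the item already holds. A rough sketch of the shape of the fix, based only on what is described above; the real LoggingItem / handleItem() code differs:

    // Rough shape of the deregistration fix being discussed: take() the
    // strdup()'d name out of the hash and free it, rather than overwriting
    // item->m_threadName (which fillItem() has already set) and leaking.
    #include <QHash>
    #include <QMutex>
    #include <QMutexLocker>
    #include <QtGlobal>
    #include <cstdlib>

    struct LoggingItem                        // stand-in for the real class
    {
        quint64  m_threadId;
        char    *m_threadName;                // strdup()-owned
    };

    static QHash<quint64, char *> logThreadHash;   // threadId -> strdup()'d name
    static QMutex logThreadMutex;

    static void handleThreadDeregister(LoggingItem *item)
    {
        QMutexLocker locker(&logThreadMutex);
        char *stored = logThreadHash.take(item->m_threadId);   // removes the entry
        std::free(stored);        // free(NULL) is a no-op if it was never registered
        (void)item;               // item keeps the name fillItem() already gave it
    }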
[21:57:19] natanojl (natanojl!~jonatan@c83-252-237-63.bredband.comhem.se) has quit (Ping timeout: 252 seconds)
[22:07:38] NightMonkey (NightMonkey!~NightrMon@pdpc/supporter/professional/nightmonkey) has quit (Ping timeout: 240 seconds)
[22:08:50] NightMonkey (NightMonkey!~NightrMon@pdpc/supporter/professional/nightmonkey) has joined #mythtv
[22:09:43] Beirdo: there. fixed that one
[22:32:00] mrand (mrand!~mrand@ubuntu/member/mrand) has quit (Ping timeout: 246 seconds)
[22:38:35] stuartm: Beirdo: it would be interesting to turn Coverity loose on master, there's a remote possibility that it will pick up on something we're missing
[22:39:05] stuartm: did you get anywhere on automating the build?
[22:51:40] zombor (zombor!~zombor_@65.29.231.135) has joined #mythtv
[22:51:40] zombor (zombor!~zombor_@65.29.231.135) has quit (Changing host)
[22:51:40] zombor (zombor!~zombor_@kohana/developer/zombor) has joined #mythtv
[22:59:09] Beirdo: not yet. I ran outta time on the weekend, but thanks for the reminder :)
[23:01:58] Beirdo: yeah, coverity might find an "oh duh!" moment
[23:03:27] NightMonkey (NightMonkey!~NightrMon@pdpc/supporter/professional/nightmonkey) has quit (Ping timeout: 244 seconds)
[23:03:36] stuartm: I don't think static analysis is going to solve this one, but then again maybe it's a lot simpler than we think and we're chasing ghosts in the machine when the answer was right in front of us
[23:04:01] Beirdo: that is often the case on the tougher to find ones
[23:04:20] Beirdo: BRB
[23:06:39] Beirdo: Gotta love stupid reboots... kernel upgrade
[23:16:05] NightMonkey (NightMonkey!~NightrMon@pdpc/supporter/professional/nightmonkey) has joined #mythtv
[23:22:31] amejia (amejia!~andres@129.174.97.130) has joined #mythtv
[23:22:31] amejia (amejia!~andres@129.174.97.130) has quit (Changing host)
[23:22:31] amejia (amejia!~andres@xbmc/staff/amejia) has joined #mythtv
[23:28:03] Mousey (Mousey!~r0dent_@ross154.net) has quit (Remote host closed the connection)
[23:28:18] dblain_ (dblain_!~dblain@c-76-127-227-175.hsd1.ma.comcast.net) has joined #mythtv
[23:28:21] Mousey (Mousey!~r0dent_@ross154.net) has joined #mythtv
[23:28:34] cesman (cesman!~cesman@pdpc/supporter/professional/cesman) has quit (Read error: Connection reset by peer)
[23:28:57] cesman (cesman!~cesman@pool-108-0-54-134.lsanca.fios.verizon.net) has joined #mythtv
[23:28:57] cesman (cesman!~cesman@pdpc/supporter/professional/cesman) has joined #mythtv
[23:28:57] cesman (cesman!~cesman@pool-108-0-54-134.lsanca.fios.verizon.net) has quit (Changing host)
[23:31:21] ghoti (ghoti!~paul@scratch.it.ca) has quit (Ping timeout: 244 seconds)
[23:32:04] ghoti (ghoti!~paul@scratch.it.ca) has joined #mythtv
[23:33:22] dblain_ is now known as dblain
[23:33:51] dblain (dblain!~dblain@mythtv/developer/dblain) has joined #mythtv
[23:33:51] dblain (dblain!~dblain@c-76-127-227-175.hsd1.ma.comcast.net) has quit (Changing host)
[23:34:58] rich0 (rich0!~rich0@gentoo/developer/rich0) has quit (Ping timeout: 244 seconds)
[23:36:00] Vernon_at_work (Vernon_at_work!~singv003@lightcloud.verns.net) has quit (Ping timeout: 244 seconds)
[23:36:24] Vernon_at_work (Vernon_at_work!~singv003@lightcloud.verns.net) has joined #mythtv
[23:40:08] xavierh (xavierh!~xavier@cpc1-swin3-0-0-cust274.3-1.cable.virginmedia.com) has quit (Ping timeout: 240 seconds)
[23:45:54] xavierh (xavierh!~xavier@cpc1-swin3-0-0-cust274.3-1.cable.virginmedia.com) has joined #mythtv
[23:47:39] rich0 (rich0!~rich0@gentoo/developer/rich0) has joined #mythtv
[23:54:41] stichnot (stichnot!~chatzilla@mythtv/developer/stichnot) has quit (Ping timeout: 255 seconds)

IRC Logs collected by BeirdoBot.