15:29:46 #startmeeting Pulp Triage 2016-11-22
15:29:46 #info bizhang has joined triage
15:29:47 Meeting started Tue Nov 22 15:29:46 2016 UTC and is due to finish in 60 minutes. The chair is bizhang. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:29:47 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:29:47 The meeting name has been set to 'pulp_triage_2016_11_22'
15:29:47 bizhang has joined triage
15:29:50 !here
15:29:50 #info mhrivnak has joined triage
15:29:51 mhrivnak has joined triage
15:29:57 anyone mind if we start with https://pulp.plan.io/issues/2431 ?
15:29:59 Title: Issue #2431: Potential data corruption - Pulp (at pulp.plan.io)
15:30:08 !2431
15:30:08 Error: "2431" is not a valid command.
15:30:10 !here
15:30:10 #info bmbouter has joined triage
15:30:10 looks a little scary
15:30:10 bmbouter has joined triage
15:30:22 !issue #2431
15:30:22 Error: '#2431' is not a valid positive integer.
15:30:28 !issue 2431
15:30:29 So close
15:30:29 !issue 2431
15:30:29 #topic Potential data corruption - http://pulp.plan.io/issues/2431
15:30:30 Pulp Issue #2431 [NEW] (unassigned) - Priority: Normal | Severity: Medium
15:30:31 Potential data corruption - http://pulp.plan.io/issues/2431
15:30:37 asmacdo, you are not the meeting chair
15:30:43 pulpbot told me
15:30:43 Error: "told" is not a valid command.
15:30:48 I know :)
15:30:49 lol :)
15:31:16 anyway, I am just looking at this one and im worried it might be bad?
15:31:32 I'm maybe misunderstanding...
15:32:04 I appreciate the reproducer, but am now reading a bunch of code to figure out what the bug is.
15:32:11 If a user specifies bad metadata, pulp stores it.
15:32:12 Anyone have a summary?
15:32:19 what is "bad" metadata?
15:32:22 i think hes saying that he can corrupt the metadata through the user metadata endpoint?
15:32:28 hm, ok.
15:32:28 that's the part I don't understand
15:32:30 !here
15:32:30 #info dkliban has joined triage
15:32:31 dkliban has joined triage
15:32:43 If you knowing inject bad metadata into pulp, you can break it
15:33:40 knowingly. Without a more "real-world" use-case, along the lines of why a user would do this in normal pulp usage (assuming POSTing bad metadata is abnormal), this seems like it maybe isn't a bug.
15:33:49 So...I'm maybe misunderstanding. :)
15:33:53 i think theres an important distinction. if it is the "custom user data" endpoint, permissions might be different than an admin right?
15:34:20 Does this reproducer use that endpoint?
15:34:43 Im not sure but it sounded like that was what he was talking about in the bug
15:35:03 I don't see "custom user data" in the bug
15:35:04 looks like it's just using the upload API.
15:35:11 url = 'repositories/%s/actions/import_upload/' % REPO_ID
15:35:13 I think I understand.
15:35:35 "user to specify additional metadata"
15:35:41 I think I took that the wrong way
15:35:42 When uploading an RPM, you can optionally specify metadata that will override any values the server side would otherwise introspect.
15:36:03 im satisfied. do we want to close?
15:36:11 http://docs.pulpproject.org/dev-guide/integration/rest-api/content/upload.html#import-into-a-repository
15:36:18 I'm thinking needinfo, e.g. "reproducer needs comments and/or bug report needs more detail"
15:36:51 I think someone on our end should take a closer look at that code and comment
15:37:03 I'll comment.
15:37:19 !propose needinfo
15:37:19 #idea Proposed for #2431: This issue cannot be triaged without more info.
15:37:19 #info smyers has joined triage
15:37:20 smyers has joined triage
15:37:20 I think I understand what he's getting at, so I'll ask for clarification along those lines.
15:37:20 Proposed for #2431: This issue cannot be triaged without more info.
15:37:22 +1
15:37:43 #action mhrivnak will follow up
15:37:47 someone probably should run his reproducer and check first, as it might show the issue better
15:38:02 pcreech, ack
15:38:16 sounds like consensus
15:38:18 !accept
15:38:18 #agreed This issue cannot be triaged without more info.
15:38:18 Current proposal accepted: This issue cannot be triaged without more info.
15:38:19 6 issues left to triage: 2427, 2429, 2433, 2434, 2435, 2436
15:38:19 #topic Apache's XSendFile directive causes 404 errors when trying to download rpm packages via HTTP - http://pulp.plan.io/issues/2427
15:38:20 RPM Support Issue #2427 [NEW] (unassigned) - Priority: Normal | Severity: Medium
15:38:21 Apache's XSendFile directive causes 404 errors when trying to download rpm packages via HTTP - http://pulp.plan.io/issues/2427
15:38:34 !propose skip
15:38:34 #idea Proposed for #2427: Skip this issue for this triage session.
15:38:35 Proposed for #2427: Skip this issue for this triage session.
15:38:57 +1
15:39:07 +1
15:39:09 alanoe, ^
15:39:17 why skip for triage?
15:39:26 we needsinfoed it last time
15:39:31 and there's no additional info on it
15:39:47 dkliban tested it and it worked for him which led me to think this was a bug against XSendFile ppc64
15:39:47 So we still need info?
15:40:19 We could close worksforme
15:40:23 theresabitofplasticinmyspacebar,brb
15:40:31 +1 to worksforme
15:40:54 dkliban could you also leave a comment with the test you performed to make sure I'm remembering this right
15:41:06 bmbouter: i'll leave a comment
15:41:06 ppc is an unsupported arch, so if dkliban or bmbouter could update the issue to reflect it works on a supported arch, I think there's nothing we can do
15:41:14 woot
15:41:19 thx
15:41:22 Sounds good.
15:41:33 !propose other close notabug
15:41:33 #idea Proposed for #2427: close notabug
15:41:34 Proposed for #2427: close notabug
15:41:50 !accept
15:41:50 #agreed close notabug
15:41:50 Current proposal accepted: close notabug
15:41:52 #topic repomd.xml in a published yum repository is empty when retrieved via HTTP - http://pulp.plan.io/issues/2429
15:41:52 5 issues left to triage: 2429, 2433, 2434, 2435, 2436
15:41:53 RPM Support Issue #2429 [NEW] (unassigned) - Priority: Normal | Severity: Medium
15:41:54 not worksforme?
15:41:54 repomd.xml in a published yum repository is empty when retrieved via HTTP - http://pulp.plan.io/issues/2429
15:42:08 smyers, ?
15:42:11 mhrivnak, sorry, I just typed the wrong thing
15:42:19 ack
15:42:25 !issue 2427
15:42:26 #topic Apache's XSendFile directive causes 404 errors when trying to download rpm packages via HTTP - http://pulp.plan.io/issues/2427
15:42:26 RPM Support Issue #2427 [NEW] (unassigned) - Priority: Normal | Severity: Medium
15:42:27 Apache's XSendFile directive causes 404 errors when trying to download rpm packages via HTTP - http://pulp.plan.io/issues/2427
15:42:32 !propose other close worksforme
15:42:32 #idea Proposed for #2427: close worksforme
15:42:33 !propose close worksforme
15:42:33 Proposed for #2427: close worksforme
15:42:34 Error: "propose" is not a valid command.
15:42:41 !accept
15:42:41 #agreed close worksforme
15:42:42 Current proposal accepted: close worksforme
15:42:42 #topic repomd.xml in a published yum repository is empty when retrieved via HTTP - http://pulp.plan.io/issues/2429
15:42:43 5 issues left to triage: 2429, 2433, 2434, 2435, 2436
15:42:44 RPM Support Issue #2429 [NEW] (unassigned) - Priority: Normal | Severity: Medium
15:42:45 :!
15:42:45 repomd.xml in a published yum repository is empty when retrieved via HTTP - http://pulp.plan.io/issues/2429
15:43:00 sorry 'bout that
15:43:08 no worries. :)
15:43:23 !propose needsinfo
15:43:23 Error: "propose" is not a valid command.
15:43:41 it's needinfo
15:43:44 +1
15:43:45 :)
15:43:53 !propose needinfo
15:43:53 #idea Proposed for #2429: This issue cannot be triaged without more info.
15:43:53 #info daviddavis has joined triage
15:43:53 daviddavis has joined triage
15:43:54 Proposed for #2429: This issue cannot be triaged without more info.
15:44:11 We definitely need more info.
15:44:18 !accept
15:44:18 #agreed This issue cannot be triaged without more info.
15:44:18 Current proposal accepted: This issue cannot be triaged without more info.
15:44:19 4 issues left to triage: 2433, 2434, 2435, 2436
15:44:20 #topic Re-uploading results in old file still being served - http://pulp.plan.io/issues/2433
15:44:20 Pulp Issue #2433 [NEW] (unassigned) - Priority: Normal | Severity: High
15:44:21 Re-uploading results in old file still being served - http://pulp.plan.io/issues/2433
15:44:38 seems like a high priority issue
15:44:54 errr severity
15:44:58 This is an interesting case. The iso plugin perhaps should not allow more than one unit in a repo with the same name.
15:45:23 Or add an option to upload to replace any existing files with the same name.
15:45:37 yep, similar to rpm duplicate nevra, but likely easier to check since it's a single field and not a combo of 5
15:45:44 exactly.
15:46:06 !propose triage high high
15:46:06 #idea Proposed for #2433: Priority: High, Severity: High
15:46:07 Proposed for #2433: Priority: High, Severity: High
15:46:09 ?
15:46:33 smyers, did you note the bottom half of the bug report?
15:46:42 what pulp version did the duplicate nevra protection land in?
15:47:03 2.8
15:47:15 It's in 2.8.7, specifically
15:47:18 Interesting.
15:47:29 He seems to be saying it's not.
15:47:31 * mhrivnak shrugs
15:47:40 in any case, I think high/high is still fine.
15:48:07 I could be wrong, just going from memory on that
15:48:18 ack
15:48:40 +1
15:48:45 !accept
15:48:45 #agreed Priority: High, Severity: High
15:48:45 Current proposal accepted: Priority: High, Severity: High
15:48:46 3 issues left to triage: 2434, 2435, 2436
15:48:46 #topic Add timing output to sync task steps - http://pulp.plan.io/issues/2434
15:48:47 Pulp Issue #2434 [NEW] (unassigned) - Priority: Normal | Severity: High
15:48:48 Add timing output to sync task steps - http://pulp.plan.io/issues/2434
15:49:10 looks like an RFE
15:49:20 yea, can't you see the timings in pulp-admin?
15:49:26 it shows start and finish already
15:49:26 this is interesting, and likely related to the previous bug, because it's related to purge_duplicate_nevra being slow on 2.8.7
15:49:40 As a user, I want to know how long individual steps in a task took.
15:50:22 !suggest convert to story: "As a user, I want to know how long individual steps in a task took"
15:50:22 #idea convert to story: As a user, I want to know how long individual steps in a task took
15:50:25 I think this should be API user only
15:50:30 no cli
15:50:45 this would be through server.conf changes
15:50:55 ?
15:50:58 so it would be done by whoever is running pulp
15:51:01 server.conf ?
15:51:48 i was imagining that you could have a setting in server.conf that enables timing for tasks
15:52:08 never mind
15:52:13 in pulp 2, the platform can't help with this.
15:52:20 is it possible to do this now?
15:52:23 because each plugin can do whatever it wants with its own progress report.
15:52:29 can you query task times?
15:52:33 yes.
15:52:41 start and finish times
15:52:47 ah
15:52:54 he wants to know how long specific parts of a task took.
15:53:07 yeah ... let's move on to the next issue though
15:53:10 So this would be a good fit for pulp 3.
15:53:12 yea, task steps
15:53:27 would this be low sev since it's an RFE?
15:53:29 Yeah, my understanding was timings on individual steps in a task, which...eek
15:53:41 But I think it's already in pulp 3, which is nice
15:54:01 we should close this
15:54:13 bizhang, when you convert it to a story you will lose the severity drop down
15:54:20 bmbouter, is this already in pulp 3?
15:54:32 the original need was really to investigate a performance problem
15:54:45 this could be done with a script combined with our api i think
15:54:50 which is better tracked already with https://pulp.plan.io/issues/1939
15:54:52 Title: Story #1939: As a user, I would like to be able to profile Pulp tasks - Pulp (at pulp.plan.io)
15:55:02 if we helped them make it it could live in pulp contrib
15:55:07 Yeah, I'm surprised this became an rfe for tasking timings, and not a "make purging duplicate nevra less slow" bug.
15:55:09 yeah we never know what to time, using the cProfile report is so much better since it has all timings
15:55:37 !propose other close wontfix
15:55:37 #idea Proposed for #2434: close wontfix
15:55:38 Proposed for #2434: close wontfix
15:55:41 hang on
15:55:50 I get that this came from a performance investigation.
15:56:05 But this request seems distinct. He's asking for timings of steps within a task.
15:56:27 Is the argument that we should not do that?
15:56:32 I think hes asking for a debugging tool
15:56:33 right but knowing that a step took more time won't help you know why
15:56:54 It helps you know where to start digging.
15:57:03 not that timings are bad necessarily, but I don't think its what he needed
15:57:07 ehelms: ^?
15:57:23 Let's skip, and talk to ehelms about it before dismissing it as not valuable.
15:57:29 that sounds good
15:57:36 !propose skip
15:57:36 #idea Proposed for #2434: Skip this issue for this triage session.
15:57:36 Proposed for #2434: Skip this issue for this triage session.
15:57:36 thank you
15:57:41 !accept
15:57:41 #agreed Skip this issue for this triage session.
15:57:41 Current proposal accepted: Skip this issue for this triage session.
15:57:43 #topic systemd unit file for pulp_streamer tries to start the service before mongod is running - http://pulp.plan.io/issues/2435
15:57:43 2 issues left to triage: 2435, 2436
15:57:44 Pulp Issue #2435 [NEW] (unassigned) - Priority: Normal | Severity: Medium
15:57:45 systemd unit file for pulp_streamer tries to start the service before mongod is running - http://pulp.plan.io/issues/2435
15:58:24 We can't use his proposal because we can't assume that mongo is on the same machine.
15:58:42 Can we add logic in the streamer to start without needing to connect to mongodb?
15:59:16 and then only connect to mongodb when needed?
15:59:30 We can do so optionally, which I think is what the bug report provides. Wants vs. Requires
15:59:49 Is there already restart logic?
16:00:00 but it would fail to start if it couldn't connect to mongodb anyways, no?
16:00:35 Im betting/hoping that if that happens it sleeps, restarts and then works?
16:00:41 I think the mongo code retries mongodb for 30s and then gives up
16:00:46 (fyi, I didn't realize you were triaging, if you need any clarification on https://pulp.plan.io/issues/2431
16:00:48 Title: Issue #2431: Potential data corruption - Pulp (at pulp.plan.io)
16:01:11 daviddavis is right, we just tested this
16:01:23 pymongo itself provides that as the default behavior
16:01:32 so if mongo doesn't start 30s after pulp_streamer, pulp_streamer just dies
16:02:11 i think the fix should be to allow users to lengthen that retry amount in their conf
16:02:30 is there a way to move that db connection out of the startup logic and into another section of the code, so the service starts, then tries connecting after running?
16:02:48 pcreech, maybe?
16:02:49 so the service itself doesn't die, but it logs errors
16:02:50 as a user, I can increase the retry connection time of the pulp_streamer
16:03:07 we can open this but I don't think it's going to be fixed in the 2.y line
16:03:10 There is a workaround here for most users at least, that they can use their systemd unit file the way he suggests.
16:03:24 and in 3.0 we have another bug tracking this functionality for postgresql
16:03:29 maybe we update the docs then?
16:03:31 Since most users do have single-machine deployments.
16:03:39 +1 to that
16:03:52 bmbouter, adding a config setting to increase that max retry would be pretty easy right?
16:04:15 i guess not necessary with the workaround
16:04:23 im fine with docs update
16:04:52 !suggest document workaround
16:04:52 #idea document workaround
16:05:01 !propose accept
16:05:01 #idea Proposed for #2435: Leave the issue as-is, accepting its current state.
16:05:02 Proposed for #2435: Leave the issue as-is, accepting its current state.
16:05:07 +1
16:05:10 the max timeout addition wouldn't be that hard
16:05:14 +1
16:05:18 to accepting
16:05:19 and we'll be doing just that for postgresql anyway
16:05:21 +1 to accepting
16:05:25 !accept
16:05:25 #agreed Leave the issue as-is, accepting its current state.
16:05:25 Current proposal accepted: Leave the issue as-is, accepting its current state.
16:05:26 1 issues left to triage: 2436
16:05:27 #topic SELinux denial prevents user login - http://pulp.plan.io/issues/2436
16:05:27 Pulp Issue #2436 [NEW] (unassigned) - Priority: Normal | Severity: High
16:05:28 SELinux denial prevents user login - http://pulp.plan.io/issues/2436
16:05:31 The workaround isn't really a workaround, I think. It's the solution...
16:05:50 Nothing actionable in that thought, just saying it out loud
16:06:03 smyers, because of multi-machine deploys, I think it's not the solution.
16:06:38 I'm saying that putting a custom systemd unit file on the system to match your deployment is the solution. Are you saying it isn't?
16:06:38 high/high?
16:06:43 If we don't block on db connections, but handle them in another part of the code gracefully, then we won't need to be dependent on them for startup, and can recover easier... but /rant
16:06:44 do we need to go back mhrivnak?
16:06:51 I guess we should.
16:07:02 !issue 2435
16:07:03 #topic systemd unit file for pulp_streamer tries to start the service before mongod is running - http://pulp.plan.io/issues/2435
16:07:03 Pulp Issue #2435 [NEW] (unassigned) - Priority: Normal | Severity: Medium
16:07:04 systemd unit file for pulp_streamer tries to start the service before mongod is running - http://pulp.plan.io/issues/2435
16:07:13 I'm just saying...that's not a workaround. That's the actual solution to the bug. Same outcome, in that we should document it.
16:07:29 smyers, ok gotcha. I agree that the user should do that for their single-machine deploy, but that still leaves a gap for users who run mongo on a different machine.
16:07:31 but in a clustered install or when mongod is on a machine other than the service
16:08:31 and regarding increasing/modifying the timeout, really that won't solve it either; what the users are looking for is full retry/reconnect support
16:08:48 but on the 2.y line for mongod those will likely get closed as WONTFIX
16:08:53 also pcreech agreed that the best is to gracefully handle it and wait for a connection, but it may not be worth the effort at this point.
16:09:01 Right, that's why I'm saying the docs should make this clear if they don't already, and probably include example for both upstart (ugh) and systemd. All I'm saying is that customizing systemd units for your specific needs is totally normal, and shouldn't be considered a workaround.
16:09:13 It's just...a work?
16:09:18 smyers, gotcha.
16:09:36 yeah that sounds fine and we can just doc this limitation
16:09:48 note also for 3.0 I was tracking this as https://pulp.plan.io/issues/2417
16:09:50 Title: Task #2417: Ensure all processes have initial and reconnect support for PostgreSQL - Pulp (at pulp.plan.io)
16:10:07 * bmbouter updates that to include pulp_streamer
16:10:20 same action item, different phrasing?
16:10:34 I think so.
16:10:38 !propose other Alter issue to be about documenting the need to customize local systemd units to fit the deployment needs
16:10:38 #idea Proposed for #2435: Alter issue to be about documenting the need to customize local systemd units to fit the deployment needs
16:10:39 Proposed for #2435: Alter issue to be about documenting the need to customize local systemd units to fit the deployment needs
16:10:41 ?
16:10:45 words words words
16:10:51 +1
16:10:55 +1
16:11:04 !accept
16:11:04 #agreed Alter issue to be about documenting the need to customize local systemd units to fit the deployment needs
16:11:05 Current proposal accepted: Alter issue to be about documenting the need to customize local systemd units to fit the deployment needs
16:11:05 works for me.
16:11:05 1 issues left to triage: 2436
16:11:06 #topic SELinux denial prevents user login - http://pulp.plan.io/issues/2436
16:11:06 Pulp Issue #2436 [NEW] (unassigned) - Priority: Normal | Severity: High
16:11:07 SELinux denial prevents user login - http://pulp.plan.io/issues/2436
16:11:15 ugh
16:11:19 So yeah, this was fun
16:11:23 heh
16:11:40 I had a note about this in yesterday's release notes for 2.10.3, because I hadn't looked into wtf was going wrong with the el6 tests
16:11:54 smyers: is this just for el6?
16:11:56 But then I did, and hooray more selinux fun on el6
16:12:07 dkliban, el7 and fc23+fc24 looked totally normal
16:12:29 (so it's just el6)
16:12:32 gotcha
16:12:43 high/high ?
16:12:48 that's what i am thinking
16:12:59 Maybe even urgent/high blocking 2.10.3
16:13:07 I don't think we get to release 2.10.3 without this being fixed.
16:13:13 agreed.
16:13:15 yeah
16:13:16 thats probably correct
16:13:17 !propose urgent high 2.10.3
16:13:17 Error: "propose" is not a valid command.
16:13:22 +1
16:13:24 WHO WROTE THIS CRAP
16:13:28 !propose triage urgent high 2.10.3
16:13:28 #idea Proposed for #2436: Priority: Urgent, Severity: High, Target Platform Release: 2.10.3
16:13:29 Proposed for #2436: Priority: Urgent, Severity: High, Target Platform Release: 2.10.3
16:13:37 smyers: and we did not see this on 2.10.2?
16:13:43 s/Target Platform Release/Blocks Release
16:13:51 dkliban, we did not, which surprises me
16:14:13 smyers: ok ... i'll take it as assigned
16:14:16 2.10.2 received a pretty high amount of automated and manual testing, with a focus on denials in el6
16:14:45 bizhang: i'll update this issue
16:14:52 And pulp-smash absolutely catches this and shows it as a glaring flaw
16:14:52 dkliban, cool
16:15:00 !accept
16:15:00 #agreed Priority: Urgent, Severity: High, Target Platform Release: 2.10.3
16:15:00 Current proposal accepted: Priority: Urgent, Severity: High, Target Platform Release: 2.10.3
16:15:01 No issues to triage.
16:15:05 (which is how I caught it)
16:15:08 !end
16:15:08 #endmeeting
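
Post-meeting note on #2435: the discussion floated two ideas, letting users lengthen the window during which pulp_streamer retries its database connection, and not dying outright when the database is slow to come up. The snippet below is only a rough sketch of the first idea, not Pulp or pymongo code. The function name, the `connect` callable, and the `max_wait_seconds` / `delay_seconds` parameters are all hypothetical and stand in for whatever the real connection code and config setting would be.

```python
import time


def connect_with_retry(connect, max_wait_seconds=30.0, delay_seconds=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or max_wait_seconds has elapsed.

    connect is a hypothetical zero-argument callable that raises
    ConnectionError while the database is unreachable. It is a stand-in
    for the real connection code, not a Pulp or pymongo API.
    """
    deadline = clock() + max_wait_seconds
    while True:
        try:
            return connect()
        except ConnectionError:
            # Re-raise once the configured window has passed; otherwise
            # wait a little and try again.
            if clock() >= deadline:
                raise
            sleep(delay_seconds)
```

A deployment where the database lives on another machine could then raise the hypothetical `max_wait_seconds` value instead of relying on systemd unit ordering, which only helps when both services run on the same host.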