22:01:25 #startmeeting OCI_weekly_3_1_2017
22:01:25 Meeting started Wed Mar 1 22:01:25 2017 UTC. The chair is mrunalp. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:01:25 Useful Commands: #action #agreed #help #info #idea #link #topic.
22:01:25 The meeting name has been set to 'oci_weekly_3_1_2017'
22:01:32 #chairs vbatts wking
22:02:19 mrunalp: is cyphar here?
22:02:43 #topic image update
22:02:48 mrunalp: stevvooe or vbatts?
22:03:03 crosbymichael: I'll ping stevvooe
22:03:14 stevvooe: I'm on
22:03:55 stevvooe: I closed down rc5, since that's been cut
22:03:59 stevvooe: is vbatts on?
22:04:39 https://github.com/opencontainers/image-spec/milestone/13
22:04:40 [grand-pairie]: are we all looking at something in particular?
22:04:58 mrunalp: i can't join today, it's late and it has been a long week at work :( i will try to continue looking at the remaining PRs in spec and runc
22:05:06 stevvooe: #575, #580, and ... are small and just need more LGTMs
22:05:13 dqminh, No worries. Thanks!
22:05:19 stevvooe: and I've opened a PR defining ChainID
22:05:25 #link https://github.com/opencontainers/image-spec/pull/586
22:06:07 stevvooe: there's also a discussion around ref.name being unique
22:06:18 #link https://github.com/opencontainers/image-spec/issues/581
22:06:33 stevvooe: but validating annotations would be a fair amount of scope creep
22:07:14 crosbymichael: haven't annotations been opaque? At least in the runtime-spec
22:07:30 crosbymichael: consumers can put in whatever they want and we won't blow up by reading them
22:08:45 stevvooe: when I run into a problem, I assume I'm doing something wrong, not that there's a problem with the spec
22:08:52 stevvooe: but I'd rather discuss this after cyphar shows up
22:09:22 mrunalp: I need to mute y'all for a minute
22:09:28 mrunalp: ^ can you take notes
22:10:02 https://github.com/opencontainers/runtime-spec/milestones/1.0.0
22:10:08 #topic runtime spec 1.0
22:12:18 stephenrwalli, I want something lightweight so test cases can refer to the spec
22:13:22 #action one more pass for the anchors
22:13:35 https://github.com/opencontainers/runtime-spec/pull/532 wking needs rebase
22:14:10 https://github.com/opencontainers/runtime-spec/pull/395
22:14:15 I want to be clear. The developers are in the best position to judge that the suite matches the specs. I do NOT want to burden the developers, however, with a lot of busy work so we have some perfect visible alignment. I’ve seen that time sink before now and it never ends well.
22:16:24 vishh, We want to understand the scenarios where pinning namespaces breaks
22:16:43 crosbymichael, It could be feature creep into the runtime
22:17:21 crosbymichael, It expands the scope
22:17:49 crosbymichael, It's not a simple task where runc can launch a container. It adds more external container mgmt.
22:18:56 vishh, Are we going to hide namespaces or let users deal with namespaces themselves
22:19:11 crosbymichael, They are exposed in the spec. One can specify which ones to join or not.
22:19:25 vishh, Agree that power users could deal with namespaces
22:19:51 vishh, bind mount pinning namespaces is pretty low level
22:20:12 crosbymichael, sharing namespaces conflicts with runc managing a single container
22:21:01 vishh, Maybe document about creating own holder container
22:21:28 vishh, documenting is fine
22:23:25 mrunalp, maybe add it to the wiki
22:24:36 vishh, plan for 1.0
22:24:45 mrunalp, cut final rc and then wait 3 weeks or so
22:24:58 stevvooe, image spec isn't quite ready yet but close
22:25:31 stevvooe, hopeful for 1.0 in a month or so
22:26:04 #topic runc I/O
22:26:04 mrunalp: "next few months" <- much less non-committal
22:26:14 stevvooe, okay :)
22:26:36 cyphar, we have 3-4 different ways to specify I/O with fds
22:27:25 crosbymichael, listen fds is needed for socket activation but conflicts with other cases
22:28:18 crosbymichael, overall I can see what the problem was. I have a PR that fixes it by removing the chown when I/O is provided
22:28:40 cyphar, problem with removing chown is that it's used today
22:28:54 crosbymichael, The caller should set it up
22:29:28 crosbymichael, code that was originally there in runc run had to be chowned manually and it got copied over
22:30:02 crosbymichael, The caller can ensure that I/O is accessible by the container
22:30:49 crosbymichael, From my example caller can set stdin/stdout to whatever you give
22:31:30 cyphar, Yeah sounds good
22:32:06 cyphar, PRs adding more features to I/O which is concerning
22:32:23 crosbymichael, Two modes 1. runc is attached 2. runc is detached and container inherits stdio
22:32:29 There isn't a way to unify the 2
22:33:17 cyphar, For other fds we have other things (non 0,1,2)
22:33:47 cyphar, preserve-fds doesn't preserve stdio
22:34:17 cyphar, We can unify implicit stdio and listen_fds
22:34:40 crosbymichael, That's UNIX stdio forever and common way
22:34:59 cyphar, fair enough
22:35:28 crosbymichael, rootless containers looking nice
22:35:38 crosbymichael, Can we merge to avoid rebases?
22:35:57 cyphar, Two things remaining 1. cgroup mgmt
22:36:20 cyphar, 2. Tests are failing because of dumpable
22:37:40 mrunalp, I'm back
22:37:44 crosbymichael, leave cgroups to higher layers
22:37:47 wking, okay :)
22:38:11 cyphar: when creating a container, it's hard to check to see if you need to create a higher-level cgroup hierarchy yourself
22:38:31 cyphar: there's essentially a permission check missing
22:38:52 crosbymichael: I could look into the cgroups stuff, but I don't think we need a lot of complexity
22:39:04 crosbymichael: either the higher levels set up a hierarchy for you, or we fail
22:39:19 crosbymichael: we touch cgroups on our own for process accounting and tracking, so what more can you do
22:39:54 cyphar: currently processes in the libcontainer API... there is a way to fix it, but I'm cautious about it because it would require exec'ing in ...
22:40:02 crosbymichael: can you mount your own cgroup hierarchy?
22:40:25 cyphar: your new namespace will always be rooted where you were
22:40:47 cyphar: and there hasn't been much motion in the kernel to allow cgroup namespacing to change ownership
22:41:06 crosbymichael: I wonder if having a named runC cgroup would fix process tracking issues
22:41:24 crosbymichael: we'd need loose permissions to make it workable, but then it might not be useful for accounting
22:41:44 crosbymichael: I don't know how much of a self-contained thing we can make rootless containers for runC
22:41:55 cyphar: apart from that, everything else works
22:42:12 crosbymichael: I tried it out, and I had to change the root for the state, but overall it just works
22:42:29 cyphar: the dumpable stuff is the only thing blocking it from being mergeable?
22:42:47 crosbymichael: the dumpable thing is... [something about star] you can't exec?
22:42:50 cyphar: yeah
22:43:07 cyphar: there's no limitation on creating a network namespace, but you can't connect that namespace to the internet
22:43:26 cyphar: someone else can do that for you
22:44:02 crosbymichael: I feel like a lot of this rootless stuff is dependent on a more-privileged user facilitating networking, cgroups, etc. and that's what most other systems do
22:44:15 mrunalp: you could have small, auditable setuid helpers
22:44:36 crosbymichael: but that doesn't help with the academic "we can't install any setuid stuff" use-case
22:45:10 cyphar: I'm sure there will be something that allows users to create their own cgroups. And we'll see what happens for network namespaces
22:45:34 mrunalp: I heard something about [...] last year
22:45:48 unprivileged network namespace setup
22:45:56 crosbymichael: we can still be useful for "I just need to install this package". It's still useful without cgroups and a network namespace
22:46:16 mrunalp: there's also a console-socket race?
22:46:19 crosbymichael: yeah
22:46:42 cyphar: the reason why we have two sockets is that we don't want to expose the internal synchronization socket
22:47:01 cyphar: the solution might be blocking somewhere... There's probably a good fix for this
22:47:18 crosbymichael: using the sync socket to also pass the console socket?
22:47:34 crosbymichael: you could say "don't send it down the sync socket, send it down this one"
22:47:40 cyphar: maybe. I'd have to look at it
22:47:50 crosbymichael: it seems like an easy way to get the blocking to work
22:48:00 crosbymichael: I can work up a patch if you're ok with that approach
22:48:02 cyphar: sounds good
22:48:19 cyphar: we also need to figure out the console socket API
22:48:30 mrunalp: anything else?
22:48:31 do I have everyone on ignore accidentally or something?
22:48:36 stevvooe: ref naming?
22:48:49 cyphar: sure
22:48:56 #topic unique ref names
22:49:06 erikh: can you hear us?
22:49:12 or can we not hear you?
22:49:16 #link https://github.com/opencontainers/image-spec/issues/581
22:49:27 oh you guys are doing a hangout or something
22:49:33 sorry, late to the party. sorry to interrupt.
22:49:38 cyphar: with index.json, ref [names] have become a second-class citizen
22:49:42 stevvooe: that's not true
22:49:51 cyphar: as a user trying to access something with a ref of $X
22:50:10 cyphar: with non-unique refs, you could have multiple refs that match
22:50:34 cyphar: if the program says "go away, give me a name that only has one ref", that's one thing, but seems like a poor choice
22:50:56 cyphar: or you can return a set of all matching refs, and then the ref system doesn't seem like a complete solution
22:51:08 cyphar: you'd need an out-of-spec way to select which of those matching refs you want
22:51:17 stevvooe: your model of the problem does not exist
22:51:34 select * from descriptors where match(myplatform, platform);
22:51:35 stevvooe: asking for a ref by name does not exist in OCI or anything around it
22:51:51 stevvooe: you'd need a centralized naming system, or Notary, or something like that
22:51:52 select * from descriptors where match(myplatform, platform) and name = "foo";
22:52:08 stevvooe: I don't see why it has to be unique.
22:52:25 stevvooe: there's no given value for it being unique
22:52:43 stevvooe: the data structure does not provide uniqueness, it provides annotations
22:52:59 stevvooe: the idea that this will create incompatibilities is absolutely ridiculous
22:53:06 stevvooe: create a UX to help users make the choice
22:53:13 stevvooe: why is this a spec problem?
22:53:34 cyphar: I understand your arguments, and that queries are a better way to handle the current situation
22:53:48 cyphar: the incompatibilities are not that you couldn't pass an image between implementations
22:54:04 cyphar: the incompat is, if you have an image, and you want to plug it into Docker, cri-o, etc.
22:54:26 cyphar: while all of these implementations will provide a way for parsing the refs, the UX will be different in each case
22:54:49 cyphar: I'm not saying we need a global agreement. But as it stands, there is no way to refer to a single ref in the image
22:55:05 cyphar: this is the first problem users will have when they try to use an image
22:55:13 stevvooe: is Docker doing it differently from everyone else?
22:55:23 stevvooe: the current implementation is by a skopeo maintainer
22:56:14 stevvooe: going back to schema1, the tag was part of the content addressable blob
22:56:39 stevvooe: a normalized data structure would have these pointers in an area where you could enforce uniqueness
22:57:13 stevvooe: once you move your UX away from the datastructure, you get more complexity
22:57:47 stevvooe: if we made the name unique, we make this one case simpler, but lots of other cases more complex
22:58:11 cyphar: I must be missing something
22:58:37 cyphar: if you have an index.json, and a single tag referencing one index, and that index references multiple manifests, that's one way to get a platform-independent image
22:58:40 stevvooe: sure
22:58:51 cyphar: so what do you gain by allowing repeated tags?
22:59:06 stevvooe: once you have an index referencing indexes, you open up lots of edge cases
22:59:20 stevvooe: if indexes could only reference manifests, the tree depth was limited
22:59:42 stevvooe: you need to be able to traverse that tree, assemble all reference descriptors, and then list the gathered set
22:59:52 stevvooe: it's convenient to flatten that tree
23:00:02 stevvooe: with a tuple set like this for two platforms
23:00:12 (mediatype, digest, "osx", "foo")
23:00:12 (mediatype, digest, "windows", "foo")
23:00:27 stevvooe: with two entries with the tag "foo"
23:00:51 (mediatype, digest, null, "foo") ->
23:00:51 (mediatype, digest, "windows", null)
23:00:51 (mediatype, digest, "osx", null)
23:01:02 stevvooe: to tools, these should be identical
23:01:11 stevvooe: there's no point in limiting these
23:01:45 stevvooe: the spec does not need to tell you how to walk the tree and collect or handle refs
23:02:00 stevvooe: Docker can say "this is not unique. Here's what I'm seeing, now give me more info"
23:02:05 stevvooe: other tools can just list them all
23:02:20 stevvooe: I think we need to stick to the given datastructure
23:02:48 stevvooe: the erroring-out approach will make your tool less compatible. So there are implications to the choices, but they don't need to be encoded in the spec (they are use-case dependent)
23:03:17 cyphar: that's all fine. If it's ok for annotations/selections to be incompatible between implementations, then that's fine
23:03:34 stevvooe: I don't think they will be incompatible, because there's no incentive for making incompatible images
23:03:55 cyphar: I'm referring to the way the user accesses an entry.
23:04:08 cyphar: but I think there should be implementer notes addressing this
23:04:27 stevvooe: make a bug and assign it to me, and I'll make some notes for platform matching and address compatibility
23:04:30 cyphar: sounds good
23:04:49 stevvooe: I have some incoming code in containerd for this, and I'll CC cyphar
23:04:53 cyphar: sounds good
23:04:58 #endmeeting
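
Editor's note on the "unique ref names" topic: the tuple sets and select-style queries pasted during the discussion can be sketched roughly as below. This is only an illustration under stated assumptions, not anything the image-spec defines. The `Descriptor` class, its simplified `platform` and `name` fields, and the `flatten` and `select` helpers are all hypothetical, and the rule that a child entry inherits a name or platform from its parent index is an assumption made here so that the nested and flat tuple sets compare equal.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Descriptor:
    # Hypothetical, simplified stand-in for a descriptor entry.
    media_type: str
    digest: str
    platform: Optional[str] = None  # simplified to a single OS string
    name: Optional[str] = None      # simplified stand-in for a ref name annotation
    children: List["Descriptor"] = field(default_factory=list)  # non-empty for an index


def flatten(descs, name=None, platform=None) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Walk the tree and return (digest, platform, name) for every leaf.

    Assumption of this sketch: a leaf that sets no name or platform
    inherits the value from its nearest ancestor that does.
    """
    out = []
    for d in descs:
        eff_name = d.name if d.name is not None else name
        eff_platform = d.platform if d.platform is not None else platform
        if d.children:
            out.extend(flatten(d.children, eff_name, eff_platform))
        else:
            out.append((d.digest, eff_platform, eff_name))
    return out


def select(descs, name, platform):
    """Return every flattened entry matching name and platform.

    The result may hold zero, one, or several entries; what to do with
    more than one match is left to the caller, as argued in the meeting.
    """
    return [t for t in flatten(descs) if t[1] == platform and t[2] == name]


# Two flat entries that share the tag "foo".
flat = [
    Descriptor("manifest", "sha256:aaa", platform="osx", name="foo"),
    Descriptor("manifest", "sha256:bbb", platform="windows", name="foo"),
]

# One index tagged "foo" that references two untagged manifests.
nested = [
    Descriptor("index", "sha256:ccc", name="foo", children=[
        Descriptor("manifest", "sha256:aaa", platform="osx"),
        Descriptor("manifest", "sha256:bbb", platform="windows"),
    ]),
]

# A deliberately ambiguous set: two "foo" entries for the same platform.
ambiguous = flat + [Descriptor("manifest", "sha256:ddd", platform="osx", name="foo")]
```

With this sketch, `flat` and `nested` flatten to the same set, which mirrors the "to tools, these should be identical" point. Calling `select(ambiguous, "foo", "osx")` returns two entries, so a tool could either list them all or ask the user for more information.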