15:03:47 #startmeeting Neutron Meeting
15:03:47 Meeting started Fri Sep 18 15:03:47 2015 UTC. The chair is edwarnicke. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:03:47 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:47 The meeting name has been set to 'neutron_meeting'
15:04:00 #topic Rollcall
15:04:07 Please #info in
15:04:09 #info edwarnicke
15:05:08 #info asomya
15:05:47 welcome asomya, anyone else?
15:06:20 #info sangeeta
15:06:57 Last call for rollcall :)
15:07:06 #info vthapar
15:07:22 No regxboi ?
15:08:23 alagalah: regXboi could not make it this morning
15:08:36 alagalah: I've been working all morning to get cantankerous enough to fill his shoes ;)
15:08:44 #topic agenda bashing
15:09:10 #link https://wiki.opendaylight.org/view/NeutronNorthbound:Meetings#Agenda_for_Next_Meeting_.289.2F11.29 <- best I've got for agenda
15:09:43 edwarnicke, We all have aspirations...
15:10:19 Guys... the agenda there is a bit... outdated... so it would be helpful here if you could #info in the stuff you currently have to talk about
15:11:14 stuff that's not on the agenda or info in stuff already on the agenda?
15:11:32 Both
15:11:41 #info ML2 ODL driver rewrite
15:12:31 #info BGPVPN code in Neutron
15:12:53 Any outstanding patches or bugs we need to address?
15:12:57 not familiar with how to do link, https://git.opendaylight.org/gerrit/#/c/26711/
15:13:39 #link https://git.opendaylight.org/gerrit/#/q/project:neutron+status:open <- list of outstanding patches
15:14:14 So I was sort of thinking we might do the agenda this way if folks are OK with it, three topics:
15:14:21 (not necessarily in this order)
15:14:28 a) ML2 ODL driver rewrite
15:14:31 b) BGPVPN
15:14:37 c) Other stray patches or bugs
15:14:48 (now is the time to speak up if you have other big rocks to talk about ;) )
15:16:03 OK... so I'm not hearing any others
15:16:10 Any objections to taking these topics in this order?
15:16:23 the agenda is fine to me
15:16:54 Cool
15:17:00 although would prefer you start with BGPVPN if possible
15:17:17 john_a_joyce: I'm fine with that, vthapar are you OK with going first?
15:17:32 edwarnicke, yep. fine with me.
15:18:01 #topic BGPVPN
15:18:08 vthapar: You have the floor :)
15:18:29 I've addressed all the review comments on it and then some fixes based on testing I've done. awaiting further review comments or +1/+2
15:19:04 vthapar: Could you #link in the patch here in context?
15:19:15 #link https://git.opendaylight.org/gerrit/#/c/26711/
15:20:26 vthapar: Question... it seems very very weird that 'route distinguishers' is described as a list but represented as a string
15:21:18 edwarnicke: it is a list of strings.
15:21:43 vthapar: But its type is string in the yang
15:22:31 edwarnicke: leaf-list with type string. will that not result in list of strings?
15:23:19 #link https://review.openstack.org/#/c/177740/ --> spec for bgpvpn in openstack off which yang is based.
15:24:27 #link https://review.openstack.org/#/c/177740/32/specs/liberty/bgpvpn.rst <- the file with the proposal
15:24:37 vthapar: I'm not seeing yang for route distinguishers there though
15:25:52 edwarnicke: line 307
15:26:18 and 282
15:27:47 vthapar: Sure, but it's showing as a *list* there, not a string
15:28:11 vthapar: Do you mind if we take this discussion to the patch, so that the meeting can move forward with other things?
15:28:26 edwarnicke: sure.
15:28:33 btw, relevant section: route_distinguishers,list(str),RW admin only,None,List of valid route-distinguisher strings (see below),(if this parameter is specified) one of these RDs will be used to advertize VPN routes
15:29:29 vthapar: Yep, I saw that :)
15:29:42 vthapar: Anything else on the BGPVPN stuff?
15:29:52 edwarnicke: not for now :)
15:30:18 #topic ML2 driver rewrite
15:30:38 john_a_joyce: I think this is you, correct?
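For context on the leaf-list question raised under the BGPVPN topic: in YANG, a `leaf-list` with `type string` holds a sequence of string values, and the JSON encoding of YANG data represents it as an array. The Python sketch below only illustrates that shape. The field name `route_distinguishers` comes from the spec quoted above; the RD values and the helper name are made up for illustration and are not taken from the patch.

```python
import json


def is_list_of_strings(value):
    """Return True when value is a list whose items are all strings."""
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


# Hypothetical payload: a leaf-list of type string arrives as a JSON array
# of strings, not as a single string.
payload = json.loads('{"route_distinguishers": ["65000:1", "65000:2"]}')

print(is_list_of_strings(payload["route_distinguishers"]))
```

A single string such as `"65000:1"` would not pass this check, which is the distinction the discussion was circling around.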
^^^^
15:30:54 yes
15:31:08 john_a_joyce: The floor is yours :)
15:31:13 plus a few others from my team :-)
15:31:34 so we posted the changes
15:32:02 https://review.openstack.org/#/c/222409/
15:32:22 there has not been a lot of comments on it
15:32:43 but we did address some of the questions that came up last week
15:32:59 especially around the dependent objects
15:33:20 john_a_joyce: What would the impact on neutron northbound at ODL be around this?
15:33:22 Does anyone have any concerns or other thoughts?
15:33:41 Per the current design there would be no impact
15:33:47 BUTTT
15:33:54 No. at least for first phase.
15:34:00 we talked last week about adding a sequence number
15:34:01 Also, i sent out an email listing details of the architecture and the plans going forward, there has been some critique of it but very little
15:34:09 as a way to track if we are in sync
15:34:30 asomya: Could you #link in the email?
15:34:42 that is highly desirable, but that would change the northbound and the API signature itself
15:35:03 edwarnicke: how do i do that? :)
15:35:07 john_a_joyce: Question on the sequence number, I presume it's generated on the OS side and monotonic, correct?
15:35:23 yes
15:35:28 john_a_joyce: eventually we need to enhance the protocol. can you share your ideas?
15:35:47 john_a_joyce: And as I recall you were going to keep a sequence number per instance of an object, correct?
15:35:51 seqnum should be so.
15:36:17 edwarnicke: A sequence number per journal row
15:36:35 asomya: Could you expand on 'journal row' ? :)
15:36:38 Our original thought was sequence number per transaction to be sent
15:36:57 so each state change in an object is a journal row, create, update, delete etc. for networks, subnet and ports
15:37:11 or journal row
15:37:40 so if we record the state changes in a sequence we can simply replay the entire sequence on restart.
This does lead to more transactions but reduces code complexity a whole lot
15:38:19 alternately, if people have major concerns on the number of transactions between neutron and odl then we can explore sequence numbers per neutron object
15:38:45 sure we can explore - but we should be sure it is required
15:39:23 if we get the design proper - then out of sync should only occur on boundary conditions like initial install
15:39:44 asomya: Ah
15:39:47 asomya: That is smart
15:39:49 or perhaps HA events
15:40:03 asomya: So you are effectively keeping a journal of transactions
15:40:09 edwarnicke: correct
15:40:25 asomya: And then you only need to check one global seqnum to find out which transactions need to be replayed, correct?
15:40:40 edwarnicke: absolutely, the same code can be reused for initial sync
15:41:10 that is the idea, but we need an indication that we need a replay and from what starting sequence number
15:41:41 john_a_joyce: How does each journal row track what needs to be replayed?
15:42:05 edwarnicke: each journal row has an associated state 'pending' indicates it needs to be synced
15:42:42 and it carries the object data for that transaction
15:42:51 asomya: OK
15:43:26 if we throw in sequence numbers we can simply mark everything from one particular number onwards pending and replay it from then on
15:43:26 the seq # would be replaying objects that never succeeded and we stopped retrying them
15:43:41 Cool
15:43:51 and that is where we need help from the ODL northbound
15:44:00 So are seqnum global across all neutron stuff then, or just ML2 objects?
15:44:05 an indication of what sequence number to start replaying from
15:44:09 (keep in mind we also need to handle the l3 stuff etc)
15:44:36 we can mark the row temp-error, then move on to next row.
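The journal idea described above (one row per state change, a 'pending' state, and replay from a given sequence number onwards) can be sketched in a few lines of Python. This is an in-memory illustration only, not the driver code under review: the class names, the `completed` state name, and the field layout are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

PENDING = "pending"
COMPLETED = "completed"  # assumed name for "already synced"


@dataclass
class JournalRow:
    seqnum: int
    object_type: str  # e.g. "network", "subnet", "port"
    operation: str    # "create", "update" or "delete"
    data: Dict        # object data carried by this transaction
    state: str = PENDING


class Journal:
    """Append-only list of state changes, ordered by seqnum."""

    def __init__(self) -> None:
        self.rows: List[JournalRow] = []
        self._next_seqnum = 1

    def record(self, object_type: str, operation: str, data: Dict) -> JournalRow:
        row = JournalRow(self._next_seqnum, object_type, operation, data)
        self._next_seqnum += 1
        self.rows.append(row)
        return row

    def mark_pending_from(self, seqnum: int) -> None:
        # Replay request: everything from seqnum onwards is synced again.
        for row in self.rows:
            if row.seqnum >= seqnum:
                row.state = PENDING

    def pending(self) -> List[JournalRow]:
        return sorted((r for r in self.rows if r.state == PENDING),
                      key=lambda r: r.seqnum)
```

With this shape, initial sync and replay share one code path: both reduce to "send every pending row, oldest first".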
15:44:37 edwarnicke: +1 (to other stuff than just ml2)
15:44:40 #link https://lists.opendaylight.org/pipermail/neutron-dev/2015-September/000368.html
15:44:47 We were hoping to get more traction and acceptance on the ML2 changes before starting the L3 changes
15:44:54 Probably a more sophisticated way is necessary to be robust..
15:45:18 yamahata: how does that help?
15:45:45 we will not leave something pending forever
15:45:52 After error, we would like to retry those later.
15:45:53 so we were already planning to do that
15:46:10 yes - we will do that
15:46:12 retry several times, then fall in real error. something like that.
15:46:13 yamahata: It does move on to the next row in order to prevent the thread from working on a single journal row indefinitely
15:46:25 mark it pending - retry X times - mark it failed
15:46:34 asomya: cool.
15:46:34 yamahata: After a configurable retry count it marks the journal row failed
15:46:42 So guys, on the ODL side (at the risk of doing live architecture) how about we just keep something like this:
15:46:47 https://www.irccloud.com/pastebin/F707Btfk/
15:46:58 (on the ODL side)
15:47:35 That way you could keep a seqnum-entry with name 'ML2' and if you needed to, a separate one with name 'L3' or yet another one with name 'BGPVPN' etc
15:47:43 Thoughts?
15:48:22 edwarnicke: it would help.
15:48:27 It seems fine to me
15:48:33 sounds good
15:48:40 vthapar: https://git.opendaylight.org/gerrit/#/c/26711 <- new comments
15:48:49 edwarnicke: ack.
15:48:55 Would anyone like to pick up doing that on the ODL side?
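The "mark it pending - retry X times - mark it failed" behaviour discussed above might look roughly like the following Python sketch. It assumes rows are plain dicts with `seqnum`, `state` and `retries` keys and that `send` is whatever callable pushes a row to ODL; all of those names are hypothetical.

```python
PENDING, COMPLETED, FAILED = "pending", "completed", "failed"


def sync_pass(rows, send, max_retries=3):
    """Run one pass over the journal, oldest row first.

    A row that fails stays pending until it has failed max_retries times,
    then it is marked failed. Either way the pass moves on to the next row,
    so a single bad row cannot hold the worker indefinitely.
    """
    for row in sorted(rows, key=lambda r: r["seqnum"]):
        if row["state"] != PENDING:
            continue
        try:
            send(row)
        except Exception:
            row["retries"] += 1
            if row["retries"] >= max_retries:
                row["state"] = FAILED
            continue
        row["state"] = COMPLETED
```

The retry limit is a parameter here because the discussion calls it configurable; the default of 3 is arbitrary.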
15:48:56 btw let me announce a bit
15:49:02 yamahata: please :)
15:49:05 I created a wiki page to accumulate info links
15:49:06 but the sequence number approach doesn't help unless we have a communication path between ODL and Neutron to get the latest seq number ODL got
15:49:11 https://wiki.opendaylight.org/view/NeutronNorthbound:NeutronDriverOverhaul
15:49:13 yamahata: Thank you *so* much for doing that :)
15:49:19 or a Seq number to start replaying
15:49:25 also a first patch for test framework
15:49:30 https://review.openstack.org/#/c/225037/
15:49:40 john_a_joyce: The nice thing about that model
15:49:43 a first try to optimize full sync
15:49:47 https://review.openstack.org/#/c/222409/
15:49:48 john_a_joyce: If ODL has a northbound to grab that sequence number then we can check it periodically and also check it on driver initialization
15:49:50 Is that you can read the seqnum
15:49:54 The problem with it
15:49:54 that's it from me.
15:50:16 Is that the seqnum is not set inline with the actual rest calls that you are updating
15:50:32 asomya: The second question is how the seqnum gets incremented on the ODL side
15:50:52 we can send it attached to each rest request from our side
15:50:56 Keep in mind, writes in ODL are asynch
15:51:16 asomya: Not to say we can't do that... but that would require a *lot* more changes to the ODL side model
15:51:31 edwarnicke: yeah that's what i was concerned about
15:51:38 edwarnicke: yes that is a problem unless we change the API signature to send with each transaction
15:52:00 john_a_joyce: I'm also a little interested how your concept of transaction maps to stuff
15:52:12 Because currently, you can update (possibly in bulk) one object type at a time
15:52:16 (in ODL)
15:52:42 I was using transaction = journal row
15:53:12 Does a journal row contain multiple object types (ie, could a journal row contain updates to a network and a port?)
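The named seqnum entries proposed earlier (one for 'ML2', others for 'L3' or 'BGPVPN') and the "check it on driver initialization" idea can be sketched as below. The contents of the pastebin are not reproduced in this log, so the store here is only a guess at its spirit: a mapping from entry name to the last seqnum ODL has seen. The class, the method names, and the "only move forward" rule are assumptions.

```python
class SeqnumStore:
    """Stand-in for the ODL-side model: one seqnum per named entry."""

    def __init__(self):
        self._entries = {}

    def update(self, name, seqnum):
        # Assumption: seqnums are generated monotonically on the Neutron
        # side, so a lower value is treated as stale and ignored.
        if seqnum > self._entries.get(name, 0):
            self._entries[name] = seqnum

    def read(self, name):
        # 0 means "nothing seen yet" for that entry.
        return self._entries.get(name, 0)


def seqnums_to_replay(journal_seqnums, store, name):
    """Return the journal seqnums that ODL has not yet seen for this entry."""
    last_seen = store.read(name)
    return [s for s in sorted(journal_seqnums) if s > last_seen]
```

On driver start, `seqnums_to_replay` would tell the Neutron side where to begin; the same call could be made periodically. How the seqnum gets written on the ODL side, given asynchronous writes, is the open question in the discussion and is not addressed by this sketch.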
15:53:19 edwarnicke: Does ODL aggregate operations to automatically do a bulk update or does it have to be an explicit bulk operation call to ODL?
15:53:32 edwarnicke: Nope, the journal rows are atomic
15:53:48 asomya: Currently, when ODL receives a rest call, it turns that into a transaction against the ODL datastore
15:53:59 Journal doesn't contain multiple object types because neutron API is designed to update only one object type.
15:54:14 So each update is only for single object type
15:54:19 yamahata: That's what I expected, for exactly the reasons you stated :)
15:54:37 One possible option would be something like this:
15:54:52 REST call to update data in ODL
15:54:52 although we should talk bulk someday especially if we need a big resync
15:54:54 followed by
15:54:59 REST call to update seqnum in ODL
15:54:59 but probably for another day
15:55:36 If we use transaction chains on the ODL side (which I think yamahata has a patch for) then we should be able to make that work :)
15:55:39 HMM - that is a bit scary I think
15:55:46 john_a_joyce: Say more :)
15:55:56 plus now we doubled the rest calls for each object
15:56:27 ODL neutron northbound can handle seqnum specifically.
15:56:27 well what if you get a success on the seqnum but not the data
15:56:52 or I guess you only send seq event on success of data
15:56:53 Don't write the seqnum unless you get a success on the data ;)
15:57:28 And if we use transaction chains on the ODL side, that will guarantee ordering of the transactions on our end
15:57:33 does ODL accept cookies?
15:57:46 asomya: Define what you mean by 'accept cookies'
15:57:53 On the neutron side you will not have a single thread sending data
15:58:00 sorry dumb question, ignore that :)
15:58:32 john_a_joyce: Sure... but you do have the journal table, correct?
15:58:38 Imposing order
15:58:50 so one thread could pass data, fail seq and the other thread pass data and seq
15:58:57 so there is a hole in the seq
15:59:09 john_a_joyce: OK...
so let's look at the unit of work
15:59:13 imposing order for dependent objects
15:59:27 Is the unit of work flushing a single journal entry to ODL?
15:59:35 not imposing absolute order relative to how ML2 placed things in its database
15:59:51 yes
15:59:57 single entry to ODL
15:59:57 john_a_joyce: in that case, do we need to change northbound api slightly to accept seqnum at the same time?
15:59:58 john_a_joyce: Sure, but it imposes *an* order, correct?
16:00:42 A given thread will always choose the oldest entry to operate on
16:00:45 yamahata: I would say seqnum-entry name and seqnum (so we update the correct one ;) )
16:01:06 yamahata: Thanks for reminding me, we can just change what we accept on the NB side
16:01:07 so the order is time based - but restricted by any dependent objects
16:01:41 john_a_joyce: Cool
16:01:53 john_a_joyce: So how about this:
16:02:20 a) We adopt a model something like: https://www.irccloud.com/pastebin/F707Btfk/ in ODL
16:02:23 yamahata - changing the northbound API is certainly attractive for keeping the seqnum logic simple
16:02:49 b) We try modifying *an* object to take a (name, seqnum) pair, and have it update the seqnum as part of the same transaction as updating the data.
16:03:11 c) We make the seqnum stuff optional, so nothing existing breaks
16:03:18 If that works well, we go on to
16:03:25 d) Make other objects accept seqnum
16:03:30 Thoughts?
16:03:53 that is cool
16:04:05 Sounds very reasonable steps.
16:04:44 then we also need some logic for Neutron to be able to query the earliest sequence number with no earlier holes
16:04:47 Alright then
16:04:51 One thought: Can ODL maintain sequence numbers and send them in HTTP replies on success? we can update the journal DB based on that number.
that way any rows in the journal DB without sequence numbers will be replayed as well as from the sequence number requested
16:04:53 Now we need to get folks to do the work :)
16:04:59 then we have handled the replay cases
16:05:16 asomya: Possibly
16:05:34 asomya: For reasons of backward compatibility, I'd suggest we only do it for requests that *include* seqnums
16:06:19 one request, be mindful that other plugins than ML2 will also need this. so make it modular/re-usable enough to avoid/minimize copy-paste code across multiple drivers.
16:06:45 vthapar: do you mean bgpvpn?
16:06:46 vthapar: Always good to keep in mind
16:07:03 vthapar: Would you be OK though with us thinking about that between c and d above
16:07:13 vthapar: I'd like to try just one object first to work out the kinks :)
16:07:47 yamahata: bgpvpn as well as l3, l2gateway, lbaas etc and any new that may come in future.
16:08:30 edwarnicke: yep. agree that getting one working should be priority.
16:08:34 vthapar: Should be completely doable
16:08:45 So... as to getting it done
16:08:54 Shall we start with the port object?
16:09:01 (as our trial case)
16:09:15 network object is most trivial.
16:09:28 yamahata: I defer to your judgement on that :)
16:09:32 port has dependency on network, subnet and security group.
16:09:39 yamahata: Good point :)
16:09:47 yamahata: network is the core object in neutron
16:10:10 yamahata: Would you be willing to do a quick first pass on the ODL side as discussed?
16:10:25 edwarnicke: Sure will give it a shot.
16:10:43 Many thanks :)
16:11:01 john_a_joyce: When will you guys be ready on your side with network objects?
16:11:31 I am assuming with the seq logic just discussed?
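The "hole in the seq" concern (several Neutron threads sending rows, so acknowledgements can arrive out of order) and the request to find the point before which nothing is missing can be illustrated with a small Python helper. It assumes seqnums are consecutive integers starting at 1, which the log does not state; the function name is made up.

```python
def last_contiguous_seqnum(acked, start=0):
    """Return the highest N such that every seqnum in (start, N] is acked.

    Anything above the returned value may sit behind a hole and is a
    candidate for replay.
    """
    acked = set(acked)
    current = start
    while current + 1 in acked:
        current += 1
    return current


# Seqnum 3 never succeeded, so replay would start from 3 even though
# 4 and 5 were acknowledged.
print(last_contiguous_seqnum([1, 2, 4, 5]))
```

Replaying from `last_contiguous_seqnum(...) + 1` resends some rows that already succeeded, which is harmless only if the operations are safe to apply twice; that is an assumption, not something settled in the discussion.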
16:11:46 john_a_joyce: Yes :)
16:12:04 Sure we can get on that
16:12:28 should not be too much additional - Arvind already had a seqnum placeholder
16:13:01 let me come back on exactly when - we were still getting through some minor issues
16:13:11 on the current change
16:13:24 john_a_joyce: Cool :)
16:13:29 edwarnicke, john_a_joyce: The immediate goal is to bring the existing code up to parity with the current ML2 driver, like adding in security_groups etc.
16:13:35 yamahata: john_a_joyce I presume you guys can sync together as you go?
16:13:48 We already are :-)
16:13:55 :-)
16:13:57 asomya: I'm pretty sure the current driver is passing security groups (we worked really hard to make that happen)
16:14:02 :)
16:14:14 security group is already supported.
16:14:19 edwarnicke: what would be the venue to test this?
16:14:21 edwarnicke: correct, but i took them out temporarily in the reqeite, have to readd them :)
16:14:29 * edwarnicke is always comforted when yamahata confirms he's not crazy :)
16:14:32 yamahata's test framework?
16:14:34 *rewrite
16:14:53 asomya: LOL... curious to find out why at some point... but as long as they don't disappear on master, we are good :)
16:15:04 https://review.openstack.org/#/c/225037/ is a first shot.
16:15:13 It will be enhanced.
16:15:39 I'll upload a slide on the idea to public area. Probably on google-doc
16:15:53 then, add a link to the wiki page.
16:15:53 yamahata: Many thanks
16:15:57 I just noticed we are 15 minutes over
16:16:12 Can we call the meeting, or do we need to cover something else?
16:16:47 nothing more from me
16:17:02 none from me.
16:17:23 #endmeeting
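Steps (b) and (c) of the plan agreed in the meeting (an object update that optionally takes a (name, seqnum) pair and records the seqnum in the same transaction as the data) can be sketched in Python as follows. The datastore is a toy stand-in, not the ODL datastore or its transaction API; every name here is hypothetical.

```python
import copy


class FakeDatastore:
    """Toy store where a commit is all-or-nothing."""

    def __init__(self):
        self.data = {"networks": {}, "seqnums": {}}

    def commit(self, mutate):
        # Apply the change to a copy and swap it in only if mutate succeeds,
        # so data and seqnum can never be written separately.
        draft = copy.deepcopy(self.data)
        mutate(draft)
        self.data = draft


def update_network(store, network, seqnum_entry=None):
    """Write a network; seqnum_entry is an optional (name, seqnum) pair.

    Callers that omit seqnum_entry get the existing behaviour unchanged.
    """
    def mutate(draft):
        draft["networks"][network["id"]] = network
        if seqnum_entry is not None:
            name, seqnum = seqnum_entry
            draft["seqnums"][name] = seqnum

    store.commit(mutate)
```

Writing both values in one commit avoids the "success on the seqnum but not the data" case raised earlier, and keeping the pair optional matches the backward-compatibility point that only requests which include seqnums are affected.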