07:07:25 <joehuang1> #startmeeting multisite
07:07:25 <collabot> Meeting started Thu Sep 29 07:07:25 2016 UTC. The chair is joehuang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
07:07:25 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
07:07:25 <collabot> The meeting name has been set to 'multisite'
07:07:51 <sorantis> #info dimitri
07:07:56 <SAshish> #info Ashish
07:07:56 <joehuang1> #info joehuang
07:08:08 <joehuang1> #topic resource sync
07:08:11 <sorantis> before we move to the topics
07:08:23 <sorantis> I've created a stable branch for kb
07:08:29 <sorantis> today I will make a release
07:08:45 <joehuang1> good, thank you dimitri
07:08:46 <sorantis> since our code for quota management is stable, I think it's fine to do so
07:08:53 <joehuang1> yes
07:09:01 <SAshish> yeah
07:09:13 <joehuang1> how about having the release in parallel with OpenStack Newton?
07:09:24 <joehuang1> we can have the stable branch now
07:09:33 <SAshish> do we have anything planned for Colorado 2.0?
07:09:40 <SAshish> or 3.0
07:09:49 <sorantis> you mean for multisite?
07:09:50 <joehuang1> Newton release is not published yet
07:10:13 <joehuang1> from my point of view, nothing new for Colorado 2.0/3.0
07:10:31 <joehuang1> do you want to add something to Colorado 2.0/3.0?
07:10:49 <SAshish> it's too short a time, we cannot
07:11:07 <joehuang1> for Kingbird, keep the release date aligned with the OpenStack Newton release date
07:11:52 <SAshish> maybe if we want we can improve the documentation inside each module for 2.0/3.0
07:11:53 <joehuang1> Multisite branch and release tag already done
07:12:31 <SAshish> just the readme files inside drivers/api/common/engine
07:12:33 <joehuang1> sure, if we have updates in the multisite documentation, we can have a later release 2.0/3.0
07:13:06 <joehuang1> the files you mentioned are in the Kingbird repo
07:13:15 <joehuang1> it's released with KB
07:14:09 <SAshish> oh, yeah. so tagging for Colorado 1.0 is based on which commit/date?
07:14:53 <joehuang1> tagging in the Colorado release is for the multisite repo, it contains some documentation
07:15:45 <joehuang1> https://git.opnfv.org/cgit/multisite/log/?h=stable/colorado
07:15:47 <SAshish> documentation + install script
07:15:56 <joehuang1> yes
07:16:56 <joehuang1> oh, BTW, we'll have a one-week holiday here in China next week, so next weekly meeting will be cancelled
07:17:32 <SAshish> holiday for?
07:17:47 <joehuang1> National Day
07:18:18 <SAshish> nice
07:18:24 <joehuang1> Thank you
07:18:37 <sorantis> +1
07:19:21 <joehuang1> to Dimitri, what do you think about the release date? It's a little weird for kb to release before Nova/Cinder have their Newton release
07:19:45 <sorantis> I'll make it so the release date matches
07:19:57 <sorantis> I'm just making all the preparations for it
07:19:57 <joehuang1> great
07:20:07 <sorantis> oh, btw
07:20:07 <joehuang1> good
07:20:09 <sorantis> https://build.opnfv.org/ci/view/multisite/job/multisite-kingbird-daily-master/
07:20:26 <sorantis> blue builds finally
07:21:05 <SAshish> yeah. finally
07:21:09 <joehuang1> thank you very much for removing the health check
07:21:16 <joehuang1> cheers!
07:21:55 <SAshish> there is a serious issue with ovs on that lab
07:22:07 <joehuang1> #info kb and openstack release date matches
07:22:21 <joehuang1> #info next weekly meeting will be cancelled
07:22:22 <jose_lausuch> sorry for hijacking this meeting
07:22:26 <jose_lausuch> talking about blue builds
07:22:29 <jose_lausuch> https://build.opnfv.org/ci/view/multisite/job/functest-fuel-virtual-suite-master/lastBuild/console
07:22:48 <jose_lausuch> I see the same error in the multisite test case as we had for healthcheck
07:22:55 <jose_lausuch> but I don't know why it returns blue :)
07:23:11 <jose_lausuch> openstack_utils - ERROR - Error [create_tenant(keystone_client, 'tempest', 'Tenant for Tempest test suite')]: Conflict occurred attempting to store project - it is not permitted to have two projects within a domain with the same name : tempest (HTTP 409)
07:24:11 <sorantis> this we can manually clean up
07:24:32 <sorantis> the problem is that the auto clean-up somehow doesn't trigger every time
07:24:39 <joehuang1> duplicated entry
07:24:46 <sorantis> so we are stuck with such resources
07:24:51 <jose_lausuch> ah ok
07:25:00 <jose_lausuch> let me know if you need any help with this
07:25:12 <joehuang1> will do. Thank you jose
07:25:18 <sorantis> I'll try to remove the tenant now
07:25:20 <jose_lausuch> thanks and sorry again
07:25:29 <sorantis> if it repeats, we will need some help :)
07:26:02 <joehuang1> so, let's get back to the resource sync topic?
07:26:04 <sorantis> fb98906298ed4ffc8db86b0cc06c2f5a | tempest
07:26:07 <sorantis> it's still there
07:26:14 <SAshish> scenarios passed
07:26:26 <SAshish> let us remove that for now
07:26:46 <sorantis> opnfv-tenant1
07:26:48 <sorantis> opnfv-tenant2
07:26:50 <sorantis> what about these?
07:27:06 <SAshish> yeah, we can remove those as well. healthcheck used to create those
07:27:27 <SAshish> I will clean that lab
07:27:40 <joehuang1> just manually remove the entry
07:27:50 <joehuang1> thank you Ashish
07:27:59 <sorantis> ok done
07:28:59 <joehuang1> thank you two :)
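
[Editor's note] The leftover projects above (tempest, opnfv-tenant1, opnfv-tenant2) were removed by hand during the meeting. For reference, a minimal sketch of the same clean-up with openstacksdk follows; the clouds.yaml entry name 'admin' and the project list are assumptions for illustration, not values taken from the lab.

    # Illustrative clean-up of leftover test projects left behind by failed runs.
    # The cloud name 'admin' and the project list below are assumptions.
    import openstack

    LEFTOVER_PROJECTS = ['tempest', 'opnfv-tenant1', 'opnfv-tenant2']

    conn = openstack.connect(cloud='admin')
    for name in LEFTOVER_PROJECTS:
        project = conn.identity.find_project(name, ignore_missing=True)
        if project:
            conn.identity.delete_project(project)
            print('Deleted leftover project: %s' % name)
        else:
            print('Project %s not found, nothing to do' % name)
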
07:29:38 <joehuang1> I am in the office, not able to access the lab and the google doc
07:31:21 <joehuang1> could you find my comments in the google doc for the resource sync blueprint?
07:32:11 <SAshish> yes, one of your comments talks about concurrency during resource syncing
07:32:23 <SAshish> the problems related to concurrency
07:33:19 <SAshish> let us discuss this
07:33:36 <joehuang1> your thoughts?
07:35:31 <SAshish> I think resource sync is an admin activity, and the admin should make sure that during resource sync there is no creation/deletion for that resource type
07:36:28 <SAshish> and the problem happens when a resource is created/deleted in the region from which he is trying to sync
07:37:03 <joehuang1> you mean the admin disables all tenant users' operations in multi-region?
07:37:27 <sorantis> i disagree with that
07:37:30 <SAshish> no
07:37:56 <sorantis> I as a tenant would like to have the option to sync my ssh keys, sec groups, images, etc
07:37:59 <joehuang1> that's not a good idea
07:38:08 <SAshish> do not disable
07:38:14 <sorantis> If I can manage it in one region, I should be able to manage it across
07:39:36 <joehuang1> but there are many users in one tenant/project; different users may have the same rights for CRUD of the same resource in multi-region
07:39:53 <joehuang1> if they have the same role
07:40:07 <joehuang1> it's not a single-user operation
07:40:39 <sorantis> then we need a role for it
07:41:22 <sorantis> I don't see multiple users for the same project doing syncing
07:41:32 <SAshish> this is a corner case
07:41:41 <sorantis> so if we define a role for it, then only the users with that role are allowed to sync
07:41:43 <joehuang1> one is doing sync, another one is deleting
07:41:50 <SAshish> one doing sync + another doing CRUD at the same time
07:42:42 <SAshish> this may lead to inconsistency, but can we really do anything about it?
07:42:51 <joehuang1> you mean in the multi-region cloud, only a single user with that role is allowed to do SEG CRUD?
07:43:10 <SAshish> that is a tough limitation
07:43:29 <sorantis> is it?
07:43:43 <sorantis> why so?
07:44:05 <joehuang1> ok, what if there are two users with the same role
07:44:13 <SAshish> if there are two admin roles
07:44:30 <SAshish> some projects will have more than one admin
07:45:05 <SAshish> this can even happen with quota sync as well
07:45:05 <joehuang1> one role can be mapped to multiple users, this is the relationship in Keystone
07:45:44 <joehuang1> quota sync is a little different: no tenant user will change the quota usage
07:45:48 <SAshish> after reading the usage from one region, if there is CRUD during calc/syncing
07:45:54 <SAshish> if a VM is deleted
07:45:58 <joehuang1> the program changes it
07:45:58 <SAshish> or a flavor is deleted
07:46:34 <SAshish> as it is a periodic task, it will have eventual consistency
07:46:56 <SAshish> if it is missed in one run, then it would be considered in the next run
07:47:04 <SAshish> that happens with quota sync
07:47:28 <SAshish> so this is a corner case that can happen in quota sync as well
07:47:49 <joehuang1> I agree for quota sync, eventual consistency is possible
07:48:11 <SAshish> as there is a next run scheduled in some time
07:48:23 <SAshish> here we don't have that, it is manual/on demand
07:48:34 <SAshish> maybe
07:48:40 <SAshish> as a cross-check we can
07:48:46 <SAshish> check after syncing
07:48:50 <SAshish> for consistency
07:48:54 <joehuang1> for flavor/volume type, it's an admin action, and a rare one, so it's also possible to get consistency
07:49:12 <SAshish> after syncing, perform a consistency check
07:49:19 <joehuang1> but for SEG, these are tenant-level resources
07:49:43 <SAshish> how about the consistency check after sync?
07:49:45 <joehuang1> and they can be changed by the tenant users themselves from different regions
07:50:10 <SAshish> we can have just one check, not more than that
07:51:06 <joehuang1> how do we judge deleted items when they were already synced to part of the other regions?
07:51:15 <sorantis> what will it give?
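
[Editor's note] On the dedicated role sorantis suggested above (only users with a sync role may trigger a sync), a minimal sketch with oslo.policy follows; the rule name 'kingbird:sync_resources', the role name 'sync_admin', and the helper function are illustrative assumptions, not existing Kingbird policy entries.

    # Hypothetical policy gate for resource sync; rule and role names are illustrative.
    from oslo_config import cfg
    from oslo_policy import policy

    enforcer = policy.Enforcer(cfg.CONF)
    enforcer.register_defaults([
        policy.RuleDefault('kingbird:sync_resources',
                           'role:sync_admin',
                           description='Only users with the sync role may sync resources'),
    ])

    def can_sync(context):
        # 'context' is assumed to be an oslo.context RequestContext carrying the
        # caller's roles and project; returns True/False instead of raising.
        creds = context.to_policy_values()
        target = {'project_id': creds.get('project_id')}
        return enforcer.enforce('kingbird:sync_resources', target, creds, do_raise=False)
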
07:51:34 <SAshish> check the already-read resource values against the current region values; if both are the same then SUCCESS, or perform sync for those missing resources
07:51:47 <SAshish> let me put it again
07:52:15 <SAshish> first consider that there are some resources which were added during resource syncing
07:52:34 <SAshish> and we are syncing from region1 to region2, 3
07:52:47 <SAshish> so during syncing the newly created resources are missed
07:53:01 <SAshish> so we will not have the newly created resources in region2, 3 after the sync
07:53:03 <SAshish> okay
07:53:11 <SAshish> now the sync has been done
07:53:29 <SAshish> so we already have the details of the resources from region1 which were read earlier
07:53:43 <SAshish> now we again get the list of resources from region1
07:53:57 <SAshish> so the before-sync and after-sync resources are compared
07:54:07 <SAshish> if both are the same then there is nothing to do
07:54:56 <SAshish> if they are not the same and after-sync has more resources than before-sync, then we have to add the extra resources, which for sure were created during the sync and which we missed in the first go
07:55:02 <joehuang1> what about another user issuing the sync command at the same time in another region
07:55:27 <joehuang1> and there are some SEG additions and deletions?
07:56:06 <SAshish> sync for the same tenant at the same instant of time looks very rare
07:56:07 <joehuang1> and during your comparison, the data in region1 changed
07:56:29 <SAshish> or we can make this a job
07:56:34 <SAshish> only one after another
07:56:43 <SAshish> don't have parallel syncs for the same tenant
07:56:50 <SAshish> as we have for quota management
07:56:58 <joehuang1> good thinking. where to control the job?
07:57:03 <SAshish> kb DB
07:57:32 <SAshish> if there is already a job running, admin 2 will have to wait
07:57:51 <SAshish> so suppose
07:57:55 <SAshish> there are 2 admins
07:57:58 <sorantis> so, it's a tenant lock
07:58:11 <SAshish> admin 1 is syncing from region 1 to region 2 and region 3
07:58:13 <joehuang1> the sync job could be locked, but we are not able to stop a user from changing the SEG in region one when you issue a job in kb
07:58:18 <SAshish> yeah
07:58:25 <SAshish> tenant lock for all the regions
07:58:37 <joehuang1> you mean lock all users?
07:58:42 <SAshish> no
07:58:49 <SAshish> lock the sync job for a tenant
07:59:06 <joehuang1> you mean lock all users' operations in each region?
07:59:06 <SAshish> for your point, where a user can change the SEG
07:59:10 <SAshish> no
07:59:13 <SAshish> lock only sync
07:59:17 <SAshish> and for your point
07:59:23 <SAshish> we will have one more level of check
07:59:37 <SAshish> after a sync is performed, check if there is consistency
07:59:45 <SAshish> if not, perform the sync for the missed ones
08:00:11 <SAshish> so logically, we give the sync two chances to become consistent
08:00:28 <joehuang1> how do you know whether the missed one reflects a delete intention or not?
08:01:04 <SAshish> can you be more clear?
08:01:07 <SAshish> sorry, didn't understand
08:01:37 <joehuang1> a user can delete a SEG / rule in each region, it's possible, during the sync job
08:02:27 <joehuang1> a user can also add/update a SEG / rule in each region during the sync job
08:02:28 <SAshish> yes it can happen
08:02:42 <sorantis> that's why Ashish proposed a two-stage check
08:02:46 <sorantis> before and after
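
[Editor's note] A minimal sketch of the per-tenant job lock discussed above (only one sync at a time per tenant, tracked in the kb DB) follows; the table name, columns, and SQLite connection string are assumptions for illustration, not Kingbird's actual schema.

    # Illustrative per-tenant sync lock enforced by a primary-key constraint;
    # the table and connection string are hypothetical, not Kingbird's schema.
    from sqlalchemy import Column, DateTime, String, create_engine, func
    from sqlalchemy.exc import IntegrityError
    from sqlalchemy.orm import declarative_base, sessionmaker

    Base = declarative_base()

    class SyncJob(Base):
        __tablename__ = 'resource_sync_job'
        project_id = Column(String(64), primary_key=True)   # one active job per tenant
        started_at = Column(DateTime, server_default=func.now())

    engine = create_engine('sqlite:///kb_sync.db')
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)

    def acquire_sync_lock(project_id):
        """Return True if this admin may start a sync; False if one is already running."""
        session = Session()
        try:
            session.add(SyncJob(project_id=project_id))
            session.commit()
            return True
        except IntegrityError:
            # A row for this tenant already exists, i.e. another sync job is in progress.
            session.rollback()
            return False
        finally:
            session.close()

    def release_sync_lock(project_id):
        session = Session()
        session.query(SyncJob).filter_by(project_id=project_id).delete()
        session.commit()
        session.close()

acquire_sync_lock() would be called when the sync request arrives and release_sync_lock() once the job (including the recheck) finishes, so a second admin's request for the same tenant is rejected or queued rather than run in parallel.
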
08:03:15 <SAshish> so our algorithm is
08:03:32 <SAshish> 1. sync 2. recheck if there is something missed 3. if missed then resync
08:03:57 <SAshish> step 2 is internal
08:04:01 <joehuang1> the resync will undo the delete intention
08:04:11 <SAshish> yes, it will
08:04:16 <joehuang1> the user who wants to delete will be angry
08:04:41 <SAshish> if it is deleted during the sync, then it would have to be deleted after the sync from the other two regions
08:05:01 <SAshish> as during the sync it has already reached region2 and region3
08:05:07 <SAshish> which we don't want
08:06:05 <SAshish> it just compares the old resources with the current resources after the sync
08:06:11 <joehuang1> time is up, more consideration of the concurrency is needed
08:06:56 <sorantis> ok
08:07:12 <SAshish> will document this
08:07:15 <SAshish> and send it across
08:07:18 <sorantis> #info Ashish to describe the concurrency approach in the doc
08:07:22 <joehuang1> ok, thank you for attending the meeting
08:07:28 <sorantis> thanks guys
08:07:32 <sorantis> talk later
08:07:39 <SAshish> sure, thanks. good bye
08:07:40 <joehuang1> #endmeeting
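
[Editor's note] For reference, a minimal sketch of the sync / recheck / resync flow Ashish outlined above (and agreed to document) follows; list_resources() and create_resource() are hypothetical per-region helpers rather than Kingbird APIs, and the sketch deliberately leaves the delete-intention question raised in the meeting unresolved.

    # Illustrative sync -> recheck -> resync flow for tenant-level resources such as
    # security groups; list_resources() and create_resource() are hypothetical
    # per-region helpers, not Kingbird APIs.

    def sync_tenant_resources(source_region, target_regions, project_id,
                              list_resources, create_resource):
        # Stage 1: snapshot the source region and copy missing resources to the targets.
        before = {r['name']: r for r in list_resources(source_region, project_id)}
        for region in target_regions:
            existing = {r['name'] for r in list_resources(region, project_id)}
            for name, res in before.items():
                if name not in existing:
                    create_resource(region, project_id, res)

        # Stage 2: re-read the source region; anything created during stage 1 was missed.
        after = {r['name']: r for r in list_resources(source_region, project_id)}
        missed = {name: res for name, res in after.items() if name not in before}

        # Stage 3: one extra pass for the missed resources (no further rechecks).
        for region in target_regions:
            existing = {r['name'] for r in list_resources(region, project_id)}
            for name, res in missed.items():
                if name not in existing:
                    create_resource(region, project_id, res)

        # Resources deleted in the source during stage 1 are NOT removed from the
        # targets here; that is the open "delete intention" question from the meeting.
        return missed
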