08:04:10 <joehuang> #startmeeting multisite
08:04:10 <collabot> Meeting started Thu Sep 24 08:04:10 2015 UTC. The chair is joehuang. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:04:10 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
08:04:10 <collabot> The meeting name has been set to 'multisite'
08:04:29 <joehuang> #topic rollcall
08:04:49 <Tapio_T> #info Tapio Tallgren
08:04:54 <joehuang> #info joehuang
08:04:58 <sorantis> #info sorantis
08:06:01 <Xiaolong> #info Xiaolong
08:06:02 <joehuang> #topic jira ticket and sprint
08:07:44 <joehuang> could we finish the review for use cases 1/2/3 before Oct 25, so that we finish sprint 1? after that we can spend two months on the use case 4/5 review
08:08:04 <joehuang> and we are now discussing use case 4, and later use case 5
08:08:46 <joehuang> use cases 1/3 are in the review/approval process in Gerrit
08:09:14 <joehuang> and because Colin has some issues accessing Jira, I will help him prepare use case 2 for review and approval
08:09:54 <sorantis> I'll review the commits asap
08:09:59 <joehuang> thanks
08:11:04 <joehuang> Colin told me he is not able to attend today's meeting
08:11:28 <sorantis> ok
08:11:31 <sorantis> let's follow the agenda
08:11:46 <joehuang> now let's talk about use case 4
08:11:56 <joehuang> #topic use case 4 discussion
08:12:53 <sorantis> in general, what I can say about UC4 is that although the central topic is the same - synchronization - the methods of doing it differ for each resource type
08:13:33 <joehuang> #link https://etherpad.opnfv.org/p/multisite_usecase_collection
08:14:25 <sorantis> first, we need to identify what exactly needs to be synced
08:14:39 <joehuang> the biggest challenge is centralized quota/IP address space management
08:15:03 <sorantis> let's treat them separately
08:15:16 <sorantis> quota has nothing to do with IP management
08:15:20 <joehuang> we also hope for a centralized service to provide the global resource view for tenant/admin
08:15:36 <sorantis> shall we start with quota?
08:15:52 <joehuang> if we only process one of them, they could be separated
08:16:14 <sorantis> I've written a solution proposal for the quota use case
08:16:25 <joehuang> yes
08:16:34 <sorantis> hope you have had the time to read it
08:16:38 <joehuang> #link https://etherpad.opnfv.org/p/centralized_quota_management
08:16:52 <joehuang> I have read it and gave comments inside the etherpad
08:17:32 <sorantis> yes, your comment is valid
08:17:51 <sorantis> that's why this approach guarantees eventual consistency
08:17:53 <joehuang> to sorantis: so proposal 1 is not recommended by you
08:19:02 <sorantis> eventually the quota limits will be set such that, if exceeded, no service will allow further provisioning
08:19:52 <joehuang> with some overcommit there
08:19:53 <sorantis> the only case that leads to inconsistency is when Kingbird writes the new quota limits and at the same time a resource is being allocated
08:20:12 <joehuang> correct, but it may happen
08:20:22 <joehuang> and it's not controllable
08:20:38 <sorantis> correct
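
A minimal sketch of the eventual-consistency scheme sorantis describes above, with a hypothetical rebalance() standing in for the central service (illustrative names only, not the actual Kingbird code): each region's limit is recomputed from one global limit plus the usage reported by the other regions, so an allocation that lands between two sync runs can briefly exceed the global limit, which is exactly the race discussed at 08:19:53.

    # Illustrative only: a central service periodically recomputes per-region
    # quota limits from a single global limit and the latest usage reports.
    def rebalance(global_limit, usage_by_region):
        """Give each region its current usage plus an equal share of the
        globally unused headroom (eventual consistency: a resource allocated
        between two sync runs can momentarily break the global limit)."""
        limits = {region: {} for region in usage_by_region}
        for resource, total in global_limit.items():
            used = sum(u.get(resource, 0) for u in usage_by_region.values())
            share = max(total - used, 0) // len(usage_by_region)
            for region, u in usage_by_region.items():
                limits[region][resource] = u.get(resource, 0) + share
        return limits

    usage = {"region1": {"instances": 30}, "region2": {"instances": 50}}
    print(rebalance({"instances": 100}, usage))
    # -> {'region1': {'instances': 40}, 'region2': {'instances': 60}}
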
08:21:32 <joehuang> xiaolong, what's your idea about use case 4 and quota management?
08:21:42 <joehuang> and tapio?
08:21:55 <Xiaolong> I took only a quick look at the proposal
08:22:04 <Xiaolong> not sure I fully understand it yet
08:22:15 <Xiaolong> from my view
08:22:17 <joehuang> so we can wait a moment
08:22:37 <Tapio_T> Alternative #2 makes sense to me
08:22:37 <Xiaolong> I think we need a synchronized database
08:22:57 <sorantis> for all services?
08:23:00 <Xiaolong> between the multiple instances
08:23:17 <Xiaolong> to keep the quota and consumption information
08:23:29 <sorantis> but quota information is spread across many services
08:23:36 <sorantis> there's no central database for it
08:23:49 <Xiaolong> before each allocation, a transaction should guarantee the consistency
08:24:00 <sorantis> Nova has its own quota management, Cinder its own, Neutron too
08:24:10 <Xiaolong> verify the limit, allocate resources, update information
08:24:30 <Xiaolong> yes, we need a new module
08:24:47 <sorantis> the problem is that even Nova's quota goes out of sync at some point
08:24:56 <Xiaolong> which retrieves info from Nova, Cinder, Neutron and keeps them synced and updated
08:25:02 <sorantis> so CERN for example has written some scripts to sync it manually
08:25:20 <sorantis> yes, this module is described in the proposal
08:26:05 <sorantis> #info http://openstack-in-production.blogspot.se/2015/03/nova-quota-usage-synchronization.html
08:26:42 <joehuang> only Nova is not enough
08:26:54 <sorantis> that is not the point
08:27:12 <sorantis> the point is that quota will eventually go out of sync in OpenStack itself
08:27:30 <sorantis> because it's quite buggy
08:27:58 <sorantis> so there's anyway a need to synchronize quota usages, even in the case of one region
08:28:04 <Xiaolong> if Nova can't do its job well, it's Nova's responsibility to improve it. we can focus on the new module, and inform the OpenStack guys about the bugs
08:31:06 <Tapio_T> The nova-quota-sync tool mentioned in the blog post looks interesting
08:31:35 <sorantis> it's basically a hack, to manually calculate resource usages and update the quota database
08:31:56 <sorantis> this should run in each region in all cases, until Nova fixes the bugs in their quota implementation
08:32:28 <Tapio_T> What more do you need, besides a distributed database?
08:32:49 <Xiaolong> a transactional control framework
08:32:59 <Xiaolong> to guarantee the consistency
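
A minimal sketch of the transaction Xiaolong outlines (verify the limit, allocate, update information as one atomic step), using sqlite3 as a stand-in for the hypothetical synchronized quota database; the table and column names are invented:

    import sqlite3

    def allocate(conn, tenant, resource, amount):
        # One transaction around the whole check-and-update, so two
        # concurrent requests cannot both pass the limit check.
        with conn:
            limit, in_use = conn.execute(
                "SELECT quota_limit, in_use FROM quotas"
                " WHERE tenant = ? AND resource = ?",
                (tenant, resource)).fetchone()
            if in_use + amount > limit:
                raise RuntimeError("quota exceeded")
            conn.execute(
                "UPDATE quotas SET in_use = in_use + ?"
                " WHERE tenant = ? AND resource = ?",
                (amount, tenant, resource))

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE quotas"
                 " (tenant TEXT, resource TEXT, quota_limit INT, in_use INT)")
    conn.execute("INSERT INTO quotas VALUES ('demo', 'instances', 10, 0)")
    allocate(conn, "demo", "instances", 3)   # ok: 0 + 3 <= 10
    # allocate(conn, "demo", "instances", 8) # would raise: 3 + 8 > 10

A real multi-writer deployment would need row locking (e.g. SELECT ... FOR UPDATE in a proper RDBMS) rather than sqlite's single-writer model, which is why sorantis notes below that this implies changes to the schedulers and quota code of Nova, Cinder, and Neutron.
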
08:33:10 <joehuang> what's the frequency to sync the quota limit in proposal 2?
08:33:37 <joehuang> agree with xiaolong, we need consistent quota management
08:33:58 <sorantis> this transactional framework will lead to changing the scheduler and quota implementation in Nova, Cinder, Neutron
08:34:13 <sorantis> joehuang, it's up to the cloud admin
08:34:38 <sorantis> this can be set to whatever value is appropriate
08:34:49 <sorantis> for some clouds it can be hours
08:34:53 <sorantis> for some a few seconds
08:35:03 <sorantis> it's a configuration parameter
08:35:12 <joehuang> hours is not accurate enough for quota control
08:35:27 <joehuang> and seconds is a traffic burden for each region
08:35:30 <sorantis> it can be, if there are very few resource allocations going on
08:36:29 <sorantis> you'll have a traffic burden in all cases if you want to keep the regions in sync
08:36:41 <sorantis> be it database replication or just API calls
08:36:48 <joehuang> so it's not production friendly
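
The interval sorantis describes is just a tunable; a rough sketch of such a periodic sync loop follows (all names hypothetical, with collect_usage() standing in for per-region Nova/Cinder/Neutron quota-usage API calls):

    import time

    # Admin-tunable trade-off discussed above: hours means stale limits,
    # seconds means constant API traffic against every region.
    SYNC_INTERVAL = 600  # seconds

    def collect_usage(region):
        # placeholder for per-region quota-usage API calls
        return {"instances": 0}

    def sync_loop(regions, interval=SYNC_INTERVAL):
        while True:
            usage = {r: collect_usage(r) for r in regions}
            # recompute per-region limits from the global limit here,
            # e.g. with rebalance() from the earlier sketch, then write
            # them back to each region
            time.sleep(interval)
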
08:37:08 <joehuang> the cascading approach mentioned in https://etherpad.opnfv.org/p/multisite_gap_analysis can currently handle these things very well, not only quota management but also the other issues, for example IP address management, global resource view, ssh key replication, etc.
08:37:26 <sorantis> cascading has a big impact on the openstack codebase
08:37:56 <joehuang> it's a separate layer to do the quota control and IP address space management
08:38:31 <joehuang> resource usage view is a closely related issue to quota management
08:38:38 <sorantis> there are service proxies to provide that separate layer, as far as I understand
08:38:49 <joehuang> yes
08:39:07 <sorantis> and there are also some alterations of the existing services (Nova, Neutron, Ceilometer)
08:39:11 <sorantis> is that correct?
08:39:26 <joehuang> no, the same OpenStack API is exposed
08:40:00 <sorantis> no, I'm not talking about the API, of course it should be the same API
08:40:27 <joehuang> OpenStack itself acts as the backend of OpenStack
08:40:44 <sorantis> yes, yes
08:40:51 <sorantis> I'm talking about the integration point
08:41:02 <Tapio_T> Sorry, I am a slow typist. We need to be able to change quotas transactionally in Nova, Neutron, etc. Is that gap recorded somewhere?
08:41:48 <joehuang> currently we have just talked about the pros/cons of the proposal https://etherpad.opnfv.org/p/centralized_quota_management
08:42:23 <sorantis> what I mean is the patches that cascading has for the existing openstack services
08:42:47 <sorantis> this: https://github.com/stackforge/tricircle/tree/stable/fortest/juno-patches
08:44:09 <joehuang> yes, for the Juno version we need patches
08:44:28 <joehuang> but all of cascading is being refactored for a new design
08:44:32 <joehuang> and that has just started
08:44:35 <sorantis> my guess is that you'll need some patches for Kilo, Liberty, etc?
08:45:05 <joehuang> #link https://docs.google.com/document/d/19BXf0RhkH8wEEymE2eHHqoDZ67gnzgvpr3atk4qwdGs/edit
08:45:51 <joehuang> the new design tries to remove the patch requirements
08:46:16 <joehuang> and to be decoupled from the current Nova/Cinder/Neutron as much as possible
08:46:49 <sorantis> ok
08:47:08 <joehuang> cascading is one option, open discussion is encouraged
08:48:27 <sorantis> good
08:48:35 <joehuang> #link https://wiki.openstack.org/wiki/Tricircle
08:48:42 <sorantis> so I'd like to hear more proposals on quota management
08:48:53 <sorantis> we have two at the moment
08:49:53 <sorantis> firstly, let's identify why we need to sync quotas at all
08:50:12 <sorantis> many public clouds have an open quota policy - you pay as you use
08:50:45 <sorantis> do we take such a case into account?
08:51:02 <joehuang> understood
08:53:16 <joehuang> I also have another idea: a standalone service for the distributed cloud, used for post-control of quota, plus global resource view and proactive on-demand replication for ssh keys/images/security groups
08:54:27 <joehuang> but service provisioning for VM/volume/network will call each region directly and separately
08:55:41 <sorantis> ok, that's what I'm also aiming for. have a service working on the side, quite transparent, which has zero impact on the openstack codebase
08:55:47 <joehuang> even for the Ceilometer part (use case 5), the view will be generated by a task manager that collects information on demand
08:57:37 <joehuang> the centralized service will collect usage from each region periodically, and send an alarm to the tenant if the quota is exceeded
08:57:52 <joehuang> this will be post-control
08:58:31 <sorantis> without any action taken?
08:59:03 <joehuang> if action is needed, then your proposal is a good complement
09:00:06 <joehuang> I think we need more thinking and discussion on use cases 4/5, let's continue next meeting
09:00:41 <joehuang> and please review the commits for use cases 1/2/3
09:01:09 <joehuang> thank you all for attending the meeting, see you next time
09:01:20 <sorantis> good. will you have time to describe your idea on the etherpad before the next meeting?
09:01:25 <Tapio_T> Thank you!
09:01:30 <joehuang> I will
09:01:33 <sorantis> thanks, bye!
09:01:36 <joehuang> bye
09:01:43 <joehuang> #endmeeting
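
A rough sketch of the post-control idea joehuang agrees to write up on the etherpad, as far as it is described above: periodically aggregate per-region usage and alarm the tenant once the quota is exceeded, rather than blocking provisioning up front (all names hypothetical):

    def post_control_check(global_quota, usage_by_region, notify):
        # Aggregate usage across regions, then alarm (not block) on overflow.
        totals = {}
        for usage in usage_by_region.values():
            for resource, amount in usage.items():
                totals[resource] = totals.get(resource, 0) + amount
        for resource, limit in global_quota.items():
            if totals.get(resource, 0) > limit:
                notify("quota exceeded for %s: %d > %d"
                       % (resource, totals[resource], limit))

    post_control_check(
        {"instances": 100},
        {"region1": {"instances": 60}, "region2": {"instances": 55}},
        notify=print)  # stand-in for alarming the tenant
    # -> quota exceeded for instances: 115 > 100
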