08:01:48 <joehuang> #startmeeting multisite
08:01:48 <collabot> Meeting started Thu Jul 16 08:01:48 2015 UTC.  The chair is joehuang. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:01:48 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
08:01:48 <collabot> The meeting name has been set to 'multisite'
08:01:55 <Malla> OK sorry, good afternoon then.. :)
08:02:26 <joehuang> when will you start your vacation?
08:02:33 <colintd_> Greetings multisiters
08:02:45 <zhipeng> greetings !
08:02:46 <joehuang> nice to see you again
08:03:22 <joehuang> #info rollcall
08:03:35 <joehuang> #info joehuang
08:03:53 <colintd_> #info colintd
08:04:47 <zhipeng> joehuang, it should be #topic rollcall :P
08:04:54 <zhipeng> #info zhipeng
08:04:55 <joehuang> sorry
08:05:01 <Malla> #info Malla
08:05:10 <zhipeng> colintd_ I'm reading your email now :)
08:05:43 <joehuang> yes, I also read the mail and slides you guys shared today
08:06:19 <joehuang> #topic use case 1 identity service management
08:06:31 <colintd_> Thanks again for the foils / diagrams.  I think trying to converge on one or a set of architecture(s) is exactly what we need to do
08:06:37 <colintd_> More offline though....
08:06:42 <joehuang> I 'll introduce the prototype briefly
08:07:13 <joehuang> The prototype is based on hafe (Hans ) docker image
08:07:28 <zhipeng> yep, I think it is not necessary that we converge to one, but converging to one set should be fine colintd_
08:07:40 <zhipeng> I will leave the floor to joe now :P
08:08:09 <joehuang> only a few minutes about identity, then let's discuss the architecture idea
08:08:28 <joehuang> the cluster works
08:09:00 <joehuang> but the async replication between clusters has not been done yet because some configuration is missing
08:10:03 <joehuang> and I also think a multi-master cluster plus multiple read-only slaves (star mode) distributed across the multisite deployment may work for fernet tokens
08:10:25 <joehuang> need further prototype
08:11:46 <joehuang> So before the prototype is finished, let's discuss the use case 2 requirements to OpenStack and the architecture idea.
08:11:46 <zhipeng> so what would be the impact on Multisite if the prototype succeeds?
08:12:31 <colintd_> Before leaving #1, I'd suggest we should think upfront about what behaviour we want under partition.  With multi-master systems, the partition behaviour almost always forces you down particular paths (CP/AP)
08:12:39 <joehuang> if the prototype succeeds, then we can recommend using fernet tokens with a KeyStone service installed in each site
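(To make the recommendation concrete: a minimal, hedged sketch of cross-site fernet token validation, assuming the fernet key repository and identity data are replicated between sites. The endpoint URLs and credentials are hypothetical, and this uses plain Python requests against the standard Keystone v3 API, not the prototype itself.)

```python
import requests

# Hypothetical per-site Keystone endpoints; assumes the fernet key repository
# and the identity data (users/projects/roles) are replicated between sites,
# e.g. by the Galera-backed cluster under test in the prototype.
SITE_A = "http://keystone.site-a.example.com:5000/v3"
SITE_B = "http://keystone.site-b.example.com:5000/v3"

AUTH_BODY = {
    "auth": {
        "identity": {
            "methods": ["password"],
            "password": {"user": {"name": "demo", "domain": {"id": "default"},
                                  "password": "secret"}},
        },
        "scope": {"project": {"name": "demo", "domain": {"id": "default"}}},
    }
}

# 1. Obtain a fernet token from site A's Keystone.
token = requests.post(f"{SITE_A}/auth/tokens",
                      json=AUTH_BODY).headers["X-Subject-Token"]

# 2. Validate the same token against site B's Keystone. Because fernet tokens
#    are non-persistent, this succeeds as long as site B holds the same fernet
#    keys and sees the same identity records -- no token table replication.
admin_token = requests.post(f"{SITE_B}/auth/tokens",
                            json=AUTH_BODY).headers["X-Subject-Token"]
check = requests.get(f"{SITE_B}/auth/tokens",
                     headers={"X-Auth-Token": admin_token,
                              "X-Subject-Token": token})
print(check.status_code)  # expect 200 if cross-site validation works
```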
08:13:58 <joehuang> yes. the management network needs to be established for replication
08:14:56 <joehuang> In e-commerce companies, they have deployed a master plus multiple read-only slave DB backend
08:15:36 <zhipeng> so how does partition tolerance go for them?
08:16:04 <joehuang> For keystone federation, the challenge is that if you add one more role, you have to change all keystone services separately
08:16:41 <joehuang> the synchronization of configuration across multiple federated KeyStone services is also a challenge
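(A hedged illustration of the duplication joehuang describes with federated, non-replicated KeyStone backends: every identity change has to be repeated against each site's endpoint. The openstacksdk calls are standard, but the cloud names are hypothetical clouds.yaml entries; this illustrates the pain point, not a proposed solution.)

```python
import openstack

# Without a replicated identity backend, a new role has to be created against
# every federated Keystone separately -- the operational burden noted above.
# "site-a"/"site-b"/"site-c" are hypothetical clouds.yaml entries, one per site.
for cloud in ("site-a", "site-b", "site-c"):
    conn = openstack.connect(cloud=cloud)
    if conn.identity.find_role("vnf_operator") is None:
        conn.identity.create_role(name="vnf_operator")
```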
08:16:41 <colintd_> My point is that we should be clear about what changes we want to allow if the network is partitioned, and then critically how the resulting system converges when the partition ends.
08:17:23 <colintd_> Many telcos have a very hard requirement that geographic sites should be able to operate as isolated entities to deal with earthquakes, floods, fires, etc knocking out interconnects.
08:17:33 <joehuang> to colintd, what's your idea on identity service management in the multisite scenario?
08:17:53 <colintd_> They normally need the ability to make changes in the isolated site to deal with changing circumstances
08:18:15 <joehuang> how to do that?
08:18:24 <colintd_> I just want to highlight this fact and make sure that we have a plan on how this is allowed, and more importantly how the reconvergence works.
08:18:47 <zhipeng> maybe we put this as a new item ? :)
08:18:52 <colintd_> I'm not saying the prototype doesn't support this (I must confess I haven't had time to look yet), but wanted to raise the issue.
08:19:28 <zhipeng> I suggest let's mark this problem for use case #1
08:19:37 <zhipeng> and see how prototype goes
08:20:03 <zhipeng> or at least see what the prototype could provide for some insight on this issue
08:20:10 <joehuang> that's why we want to have a keystone service in each site, so that no matter whether a site fails due to earthquake, flooding, or anything else, the other sites can still work.
08:21:30 <zhipeng> joehuang I think colintd_'s concern is a very legit one, let's keep the experiment going, and see how this question would be answered :)
08:21:55 <zhipeng> we could wrap this all in the end into the requirement doc
08:22:08 <joehuang> ok
08:22:35 <zhipeng> let's continue on use case #2, hope we could reach some conclusion today :P
08:23:48 <joehuang> if KeyStone service is not installed in each site, we have to handle the "Escape from site level KeyStone failure" use case. I'll go on with the prototype; let's note Colin's concerns and see how to address them
08:24:52 <zhipeng> #info if KeyStone service is not installed in each site, we have to handle the "Escape from site level KeyStone failure" use case. I'll go on with the prototype; let's note Colin's concerns and see how to address them
08:26:54 <joehuang> for use case 2, the requirements to OpenStack are 1) cross-Neutron L3 networking, 2) cross-Neutron L2 networking
08:27:58 <zhipeng> So for #2, I think this is a classic case of what colintd_ proposed in the email discussion, that we have both mgmt and NFVI requirements
08:28:29 <zhipeng> for #2 a large part of the deal involves mgmt
08:28:29 <joehuang> to Colin, what are your concerns on this use case's requirements to OpenStack?
08:28:56 <zhipeng> I agree that some of the control/mgmt function should be handled by OSS/BSS or MANO
08:29:08 <colintd_> Absolutely.  The use case is about both the VNF networking and the management it requires
08:29:10 <joehuang> for example?
08:29:32 <zhipeng> but I think VIM should also be able to support the actual implementation of the upper decisions
08:30:52 <colintd_> I'm happy with the current usecase/requirements text in the etherpad.  However, I agree with zhipeng's efforts to produce a small number of coherent architecture diagrams to pull together all the usecases.
08:31:06 <fzdarsky> Hello. Can we narrow the problem space down to HA of independent clouds / VIMs on the same site, i.e. the same core router / not crossing the WAN?
08:31:30 <zhipeng> why not crossing WAN fzdarsky :P
08:31:31 <joehuang> same site, I assume
08:31:32 <fzdarsky> I think those architecture diagrams are great (thanks!) but lack the aspect of geos
08:31:44 <fzdarsky> see colintd_ 's mail
08:31:54 <joehuang> for cross site, it's something for the use case 3
08:32:01 <zhipeng> ah get it
08:32:10 <fzdarsky> yes. different use case, potentially different solutions
08:32:18 <colintd_> Agreed #2 is intrasite clouds, #3 is intersite clouds
08:32:37 <zhipeng> #agreed #2 is intrasite clouds, #3 is intersite clouds
08:32:47 <fzdarsky> as in: for intrasite, I would expect a single VNF instance
08:32:48 <zhipeng> bot cmd is your friend lol
08:32:51 <fzdarsky> which implies a single VNFM instance
08:33:16 <fzdarsky> for cross-site, it would be independent VNF instances
08:33:35 <fzdarsky> + VNFMs
08:33:36 <joehuang> not a single VNFM, there can be multiple VNFMs; different VNFs can be managed by different VNFMs
08:33:43 <colintd_> Agree again.  Each VNFI has a single "owning" VNFM, but that VNFM may cover multiple clouds in a site
08:34:03 <fzdarsky> It's a question of fate sharing.
08:34:09 <zhipeng> fzdarsky could there be multiple VIMs in one site ?
08:34:18 <colintd_> (The whole internal structure of the NFVO and VNFM is not covered by the ETSI NFV docs, but must clearly be HA and cross-cloud to hit the required availability numbers)
08:34:20 <fzdarsky> zhipeng, absolutely!
08:34:32 <joehuang> VNFM is software to manage VNF
08:34:46 <Malla> Yes, VNFM can cover multisite VNFs
08:35:06 <colintd_> Absolutely multiple VIMs per site.  As per my previous email, this is a very common model in the data world to get availability
08:35:44 <fzdarsky> The scope of the VNFM is defined by the scope of the VNF instance.
08:35:44 <zhipeng> so for #2, what requirements could we settle on now?
08:36:37 <zhipeng> guys, could we agree on single-site, cross-VIM overlay L2 networking being one of the requirements of #2?
08:36:48 <Malla> but this scope is not limited to a site, right?
08:37:05 <fzdarsky> Malla, for use case #2 it is
08:37:20 <joehuang> to Malla, "VNFM can cover multisite VNFs" -> "VNFM can cover multiple/multisite VNFs"
08:37:21 <colintd_> I think L2 inter-cloud intra-site is a common approach, but an L3 solution in site is a valid (if harder) option.
08:37:57 <zhipeng> so both L2 and L3 should be addressed, agree ?
08:38:00 <Malla> Agree
08:38:13 <colintd_> To me the biggest difference between #2 & #3 is that #2 is all about maintaining media/signalling and calls (which requires IP transfer), whilst #3 is about restoring/continuing service but most likely not calls.
08:38:14 <fzdarsky> agree (as long as only intra-site :))
08:38:41 <joehuang> agree to coline
08:38:43 <zhipeng> yep maintaining calls would be a nightmare for #3 :P
08:38:45 <zhipeng> so
08:38:49 <colintd_> #3 does not require special openstack networking support, but #2 does.
08:39:00 <colintd_> agree
08:39:32 <zhipeng> #agreed inter-cloud intra-site L2 and L3 networking enhancement is one requirement from OPNFV Multisite to OpenStack
08:39:40 <fzdarsky> IP continuity across geos is a road we don't want to take; this is why clearly separating #2 and #3 is important.
08:39:42 <joehuang> Ok, can we reach the conclusion that overlay L2 networking across Neutron services intra-site is required in OpenStack?
08:40:02 <zhipeng> I think we just voted yes on this
08:40:27 <joehuang> for #3, it's more about volume replication for restoration purposes
08:40:38 <zhipeng> yes
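(A rough, hedged openstacksdk sketch of the volume-replication angle mentioned for use case #3. Cloud and volume names are hypothetical; it also assumes both sites' cinder-backup services write to a shared or replicated object store, and that the backup record is carried over to the recovery site, e.g. via cinder's backup record export/import, which is exactly part of the gap under discussion.)

```python
import openstack

# Assumptions: shared/replicated backup object store between sites, and the
# backup record is made known to site B out of band. Cloud names are
# hypothetical clouds.yaml entries.
site_a = openstack.connect(cloud="site-a")
site_b = openstack.connect(cloud="site-b")

# Site A: take a backup of the VNF's state volume.
vol = site_a.block_storage.find_volume("vnf-state-volume")
backup = site_a.block_storage.create_backup(volume_id=vol.id,
                                            name="vnf-state-backup",
                                            force=True)  # volume may be in use
site_a.block_storage.wait_for_status(backup, status="available")

# Site B: once the backup record is visible there, restore into a new volume.
site_b.block_storage.restore_backup(backup.id,
                                    name="vnf-state-volume-restored")
```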
08:40:39 <colintd_> I think we're agreed that #2 needs some technology for allowing transfer of IP addresses/traffic.  This could be either L2 or L3.
08:41:11 <zhipeng> colintd_ what would this translate to as a requirement for OpenStack?
08:42:06 <joehuang> Agree, but we need to describe that from two aspects: one is for VNF communication with other VNFs, the other is for VNF-internal communication for heart-beat and session replication
08:43:26 <colintd_> For L2 the major requirements relate to config/management of those networks.  For L2 do you need to use provider networks?  Exactly how do you disable anti-spoof support? etc.  For L3 it might make sense to have a common neutron api for the take IP/ free IP support, which can then be plumbed onto multiple underlying technologies.  In fact it may even make sense to use the same API for L2, just have it trigger GARP.
08:43:59 <zhipeng> #info For L2 the major requirements relate to config/management of those networks.  For L2 do you need to use provider networks?  Exactly how do you disable anti-spoof support? etc.  For L3 it might make sense to have a common neutron api for the take IP/ free IP support, which can then be plumbed onto multiple underlying technologies.  In fact it
08:43:59 <zhipeng> may even make sense to use the same API for L2, just have it trigger GARP.
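(A hedged sketch of the existing per-port Neutron knobs colintd_ refers to, via openstacksdk; the port name, virtual IP and cloud name are hypothetical. Either whitelist the movable IP with allowed-address-pairs so the VNF can claim it with gratuitous ARP, or switch anti-spoofing off on the port outright.)

```python
import openstack

# Hypothetical port and virtual IP; a sketch of today's Neutron knobs relevant
# to the "take IP / free IP" discussion, not a proposed new API.
conn = openstack.connect(cloud="site-a-cloud-1")
port = conn.network.find_port("vnf-ha-port")

# Option 1: keep anti-spoofing enabled, but whitelist the virtual IP that the
# standby VNF instance will claim by sending gratuitous ARP on failover.
conn.network.update_port(port,
                         allowed_address_pairs=[{"ip_address": "192.0.2.100"}])

# Option 2: disable port security entirely (any security groups must be
# removed from the port first), turning off MAC/IP anti-spoofing so the VNF
# can run its own L2 failover mechanisms.
conn.network.update_port(port, port_security_enabled=False)
```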
08:44:23 <zhipeng> colintd_ how about joe's comment on the two aspects ?
08:44:42 <fzdarsky> #info may even make sense to use the same API for L2, just have it trigger GARP.
08:44:50 <fzdarsky> (missed the last piece ;))
08:45:58 <colintd_> Yes, worth pulling out that we have the need to route external traffic to the VNFI split across two clouds, and the somewhat separate need to set up the intercloud comms for intra-VNF traffic.
08:46:12 <colintd_> The latter leads onto cross-cloud tenant networks
08:46:47 <joehuang> agree, it's cross-cloud tenant networks for intra-VNF traffic
08:47:33 <fzdarsky> For the routing piece, the question is whether this leads to a req'ment on OpenStack or not.
08:47:49 <fzdarsky> (if we want to fail over between independent OpenStack instances).
08:47:51 <zhipeng> does Neutron support it right now?
08:48:30 <colintd_> I didn't think it did, but could be wrong
08:48:49 <joehuang> currently you have to route through an external network (this is a provider network) or through an inter-VPN connection
08:49:07 <zhipeng> then I think it should be a requirement for OpenStack
08:49:14 <colintd_> The key issue is that of scope, where you have two clouds coordinating/interacting (hence being covered by multisite).
08:49:31 <colintd_> Agree
08:50:15 <Malla> Sorry guys, just for my information, are we planning to discuss an architecture proposal (i.e. identifying the best option in the proposal) in this meeting or the next meeting?
08:50:33 <zhipeng> we could discuss it firstly via email
08:50:40 <Malla> ok
08:50:42 <Malla> thanks
08:50:45 <zhipeng> and then in the meeting :)
08:50:48 <joehuang> we need a tenant-level L2 network across Neutron to bridge the tenant routers for L3. For L2, we need a cross-Neutron overlay L2 network (some applications use an L2 network for session replication)
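(One way operators stitch an L2 segment across two Neutron instances today, sketched with openstacksdk under the assumption that both clouds map onto the same physical VLAN; the physnet name, VLAN id and cloud names are hypothetical. This is a workaround rather than the cross-Neutron overlay L2 capability being asked for.)

```python
import openstack

# Create a provider network in each cloud that maps onto the same physical
# VLAN, so workloads in both clouds share one L2 segment. Values are
# hypothetical; requires admin rights and matching physnet wiring.
SEG_ID = 1234

for cloud in ("site-a-cloud-1", "site-a-cloud-2"):
    conn = openstack.connect(cloud=cloud)
    net = conn.network.create_network(
        name="vnf-replication-net",
        provider_network_type="vlan",
        provider_physical_network="physnet1",
        provider_segmentation_id=SEG_ID)
    conn.network.create_subnet(
        network_id=net.id, ip_version=4, cidr="10.10.0.0/24",
        enable_dhcp=False,  # avoid two DHCP servers on one shared segment
        name="vnf-replication-subnet")
```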
08:50:50 <zhipeng> save time for use cases
08:52:13 <zhipeng> ok then could we sum up a second req out of #2? that is, given intra-site inter-cloud deployment, we need L2/L3 IP traffic transfer
08:52:39 <zhipeng> ? or could someone reword it for a better description :P
08:53:22 <colintd_> I'll have a go at a reword
08:53:44 <joehuang> thanks
08:54:04 <zhipeng> so then we could agree upon a second req on this aspect, right?
08:56:11 <colintd_> Yes
08:57:06 <zhipeng> #agreed second req out of use case #2: given intra-site inter-cloud deployment, enhancement on L2/L3 IP traffic transfer (inter- and intra-VNF) is a requirement from OPNFV Multisite to OpenStack
08:57:26 <zhipeng> #action colintd_ to reword the req to be more accurate :)
08:58:48 <zhipeng> okay I think we had a pretty awesome session today :)
08:59:15 <colintd_> Agree.  Lots of ground covered and some real improvements in our shared vision
08:59:30 <joehuang> time flies; it takes time to reword the requirement. Let's have the architecture discussion, the use case #3 discussion, and the review of the rewording
08:59:48 <joehuang> in the next meeting
09:00:07 <zhipeng> and arch discussion could be done in ML
09:00:13 <joehuang> agree
09:00:24 <joehuang> thank you all for the meeting
09:00:31 <zhipeng> thank you
09:00:37 <joehuang> see you next time
09:00:56 <fzdarsky> thanks!
09:01:02 <joehuang> #endmeeting