16:02:09 #startmeeting JOID
16:02:09 Meeting started Wed Nov 2 16:02:09 2016 UTC. The chair is narindergupta. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:09 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:02:09 The meeting name has been set to 'joid'
16:02:19 ssh -i .ssh/id_maas ubuntu@10.9.1.51 - wrong key
16:02:24 which key should I be using?
16:02:24 #info Narinder Gupta
16:02:39 from the jumphost with the jenkins user
16:02:46 run ssh ubuntu@ip
16:02:55 it should pick the default key which is in MAAS
16:04:47 mbeierl, joining the joid call?
16:05:22 in a conflicting work call
16:05:32 irc only for now ....
16:05:43 bryan_att, are you joining?
16:05:52 yes
16:05:54 #Agenda
16:06:16 #info Anand Gorti
16:06:52 #info Mark Beierl
16:07:30 #info Artur Tyloch
16:08:22 #info Bryan Sullivan
16:08:25 #topic agenda bashing
16:08:31 #link https://etherpad.opnfv.org/p/joid
16:10:19 #topic Colorado release 2.0 status
16:15:38 #topic D release implementation
16:37:00 #topic lxd in C
16:37:06 #link https://etherpad.opnfv.org/p/lxd
16:50:27 #topic OPNFV Plugtest
16:50:48 #info Lenovo will send HW for the Plugtest
16:51:17 OCP nodes, small configuration: 3 servers + 1 jump host + required switches
16:51:28 Additionally GP servers.
16:57:06 #topic Movie project
17:56:11 Narinder Gupta proposed joid: modified to have better support of maas 2.0 https://gerrit.opnfv.org/gerrit/23883
17:56:23 narindergupta: ok, I have verified that all 4 bare metal nodes have all interfaces working and VLANs working as documented in the wiki
17:56:37 narindergupta: so I have no idea why MAAS cannot see the VLANs
17:56:59 Merged joid: modified to have better support of maas 2.0 https://gerrit.opnfv.org/gerrit/23883
17:57:50 can you enable dhcp in maas on that interface and see whether it gets an ip?
17:58:15 are you able to ping the jumphost ip?
17:58:27 first, did you want to look at the configuration that I did on the bare metal servers?
17:59:50 can you pastebin it?
18:01:28 http://paste.ubuntu.com/23417192/
18:01:48 that's m1; .52, .53, .54 for m2, m3, m4
18:13:35 http://paste.ubuntu.com/23417224/
18:13:58 narindergupta: ^ shows the matrix of eth interfaces, subnets and hosts
18:43:42 Narinder Gupta proposed joid: modified worker mutiplier from 1.0 to 1.1 as 1.0 is considered as integer rather than floating. https://gerrit.opnfv.org/gerrit/23903
18:47:20 Narinder Gupta proposed joid: modified worker mutiplier from 1.0 to 1.1 as 1.0 is considered as integer rather than floating. https://gerrit.opnfv.org/gerrit/23903
19:17:59 mbeierl, sorry I had another meeting and got busy there. If this looks good to you let's change the labconfig.yaml to reflect that
19:18:33 ok. I thought the labconfig.yaml already reflected it, but that was where the question about eth1 vs. mac address came from
19:18:57 in the labconfig.yaml eth0 and eth1 have the mac addresses as listed in the wiki
19:19:16 yet, once deployed, the mac addresses show up under different ethN devices
19:35:26 mbeierl, eth0 and eth1 are not hard coded and we can reflect the same. The labconfig.yaml was created but never verified
19:35:48 can we change labconfig.yaml as per the deployed nodes
19:36:15 what needs to change?
19:36:50 I'd love to change it, but seeing as I was the one who put that together, thinking it does reflect the deployed nodes, I don't know how to fix it
19:41:02 mbeierl, change labconfig.yaml as per the deployed nodes
19:41:09 and mention this mac is for that space
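A quick way to reconcile labconfig.yaml with what the deployed nodes actually report is to dump the interface-name-to-MAC mapping on each node. A minimal sketch, run from the jumphost as the jenkins user (so the default MAAS key is picked up, as described above); the node addresses are the 10.9.1.51-.54 ones mentioned for m1-m4, the rest is generic Linux:

    for host in 10.9.1.51 10.9.1.52 10.9.1.53 10.9.1.54; do
        echo "== $host =="
        ssh ubuntu@"$host" 'for n in /sys/class/net/*; do
            [ -f "$n/address" ] && echo "$(basename "$n")  $(cat "$n"/address)"
        done'
    done

The output is the ethN-to-MAC table that the labconfig.yaml nics entries need to agree with.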
19:41:16 I wrote the labconfig according to what is deployed
19:41:23 therefore it is already done
19:42:34 so why does it not work?
19:48:01 ok, the issue is with vlan id 905 for public; the ip is not seen
19:48:18 so how do I change the yaml to fix that?
19:50:34 mbeierl, this is in labconfig.yaml
19:50:35 - ifname: eth1
19:50:36 spaces: [admin]
19:50:36 mac: ["00:1E:67:D4:30:38"]
19:50:36 - ifname: eth2
19:50:36 spaces: [data]
19:50:37 mac: ["00:1E:67:C5:5B:08"]
19:50:41 - ifname: eth3
19:50:43 spaces: [public]
19:50:45 mac: ["00:1E:67:C5:5B:09"]
19:50:47 which does not reflect the same
19:50:58 as eth3 is for pxeboot
19:51:05 which is admin
19:51:17 while here in labconfig it is written as eth1
19:51:27 similarly no mention of vlans here
19:51:48 so labconfig.yaml does not reflect the correct settings
19:52:21 like eth2
19:52:24 nope. The mac address 00:1E:67:C5:5B:09 is not supposed to be for PXE boot. Don't know why MAAS chose to use it for that?
19:52:58 maas is not using 09
19:53:21 maas is using 38
19:53:44 that's why I am saying labconfig.yaml is not reflecting the deployed node, so let's correct it
19:54:21 mac 30:38 should be pxe boot and marked as Admin
19:54:45 so what does MAAS define as Admin? Perhaps it would help if I actually knew what networks are needed and what their names are
19:54:51 I am going by what is in the wiki
19:55:09 spaces: [admin]
19:55:09 mac: ["00:1E:67:D4:30:38"]
19:55:20 Um... does that not mean that mac 38 is marked as admin?
19:55:30 is that not what you just said it should be?
19:56:55 (03:54:21 PM) narindergupta: mac 30:38 should be pxe boot and marked as Admin
19:57:03 ^ It is marked as such in the yaml
19:57:35 yes
19:57:43 so, the yaml is correct?
19:57:44 30:38 will be admin for pxe boot
19:57:52 no, it says eth1 in the yaml
19:58:02 so change it from eth1 to eth3
19:58:25 and we do not have issues on the admin network
19:59:16 ok, I'm going to back up and ask again here: why is it being called eth3 in the deployed ubuntu OS? How, before doing a test deployment, can someone possibly know what to call it?
19:59:58 the user does not define it
20:00:01 It was eth1 under fuel. It has not changed positions, so is it always a "do a test deployment and see what the interface assignment ends up being"?
20:00:25 it is picked up by the distro based on device function etc and we see that changing
20:01:04 so, again: in order to write a labconfig.yaml file, I need to guess, perform a deployment, read the real values, and then re-write the file
20:01:06 correct?
20:01:08 it depends upon how the kernel sees it. The industry has tried to streamline it but it is not very helpful
20:01:18 correct
20:01:44 my preferred way is to go by mac and match the naming convention
20:01:53 how do I do that?
20:02:06 I have been asking about this for the past day
20:02:14 check the mac and assign whether it is the admin network or another network
20:02:35 do a mac to network type mapping
20:02:36 I kept asking "does it matter that the ethN and MAC addresses assigned are different" and you did not indicate this was a problem
20:02:51 do I need to specify the ifname at all then?
20:02:53 again that's not the problem
20:03:07 you can skip it
20:03:11 skip what?
20:03:12 make it blank
20:03:19 make what blank?
20:03:23 specifying the eth name
20:03:29 ifname
20:03:35 ok, so why did you just tell me to change it from eth1 to eth3?
20:04:04 because you are getting confused with eth3 and how it is deployed
20:04:08 nics:
20:04:09 - spaces: [admin]
20:04:09 mac: ["00:1E:67:D4:30:38"]
20:04:09 spaces: [data]
20:04:09 mac: ["00:1E:67:C5:5B:08"]
20:04:09 - spaces: [public]
20:04:09 mac: ["00:1E:67:C5:5B:09"]
20:04:12 so that will work?
20:04:14 so I asked you to match it
20:04:38 it should work; if it does not then I need to fix it in my code
20:06:26 and what about the VLANs then
20:06:35 do I need to change anything in the yaml for them?
20:06:43 they should be mentioned under the space
20:07:01 how?
20:07:03 currently vlan is blank
20:07:14 check at the bottom of labconfig.yaml
20:07:21 what about the fact that there is bridge: in there?
20:07:25 vlan is blank for all
20:07:33 on the hosts there is no bridge
20:07:55 on the jumphost I created a bridge
20:08:04 for all networks
20:08:09 so bridge is just for the jump host
20:08:16 correct
20:08:16 in the labconfig.yaml
20:08:39 and it is smart enough to know that there is no VLAN id over top of the bridge
20:08:40 basically in the opnfv section you are telling it your network spaces
20:08:55 maas should be aware
20:09:00 right, but there is no bridge in the network space of the bare metal hosts
20:09:21 so I just want to make sure it is clear we are mixing the two together in the one file
20:09:36 what are we mixing?
20:09:57 bridge is for the MAAS VM to say which bridge it needs to use for the admin network space
20:10:08 and this is the CIDR for the admin network
20:10:08 jumphost with bridge interfaces and bare metal with VLAN tags
20:10:38 those bridges are mapped to the vlan tagged network
20:10:42 run ifconfig -a
20:10:48 on the jumphost you will see
20:11:08 but if you can verify I have the right interface with the right vlan and bridge, that will help
20:11:54 yes, I know that, but on the bare metal, an interface called eth0.904 needs to be created. On the jump host there is an interface called brAdmin - without a vlan tag. Just want to make sure that we are specifying both bridge interface and a VLAN tag at the same time
20:13:20 we did it on the jumphost manually and you can verify; if there is a gap we can fix the jumphost
20:14:46 But wait - how did MAAS know to create eth0.904? There is no mention of .904 in the labconfig.yaml file?
20:14:59 as you said - all VLANs are empty
20:20:46 mbeierl, I have to check that; somehow maas knew that on this network vlan 904 is tagged
20:21:04 but could not find the info on the other network
20:21:11 so with that being said, what do I put into the labconfig.yaml for vlans?
20:21:51 For an example, we have the public space, which is MAC 00:1E:67:C5:5B:09 on the first server
20:21:58 just give the vlan no
20:22:07 This MAC supports both tagged and untagged
20:22:37 we should give both space names then
20:22:37 as "brStor", it passes traffic without a VLAN at all
20:22:48 ok, which is good
20:22:50 are the names arbitrary?
20:23:26 maas rediscovers the network and finds out whether it makes sense or not and then configures it
20:23:44 if you try to assign the wrong subnet it will refuse
20:23:59 everything is based on the interface exposed to maas in a VM
20:24:03 Let me walk through the 5B:09 mac example again...
20:24:20 that is one of the reasons I am moving maas out of the VM as well
20:24:20 Does this work
20:24:23 - spaces: [public, storage]
20:24:23 mac: ["00:1E:67:C5:5B:09"]
20:24:32 it should, but we can try
20:24:55 because it has public with VLAN 905, and storage without a VLAN id
20:25:44 ok no problem let's try, otherwise we can configure it manually
20:25:56 after maas installs
20:26:28 ok, then I also have for the other MAC:
20:26:34 spaces: [data, management]
20:26:34 mac: ["00:1E:67:C5:5B:08"]
20:33:15 ok
20:34:38 now, does the word "type" in the spaces match one of the values in the spaces: key?
20:35:01 like, do I need to add a type: management now that I added spaces: [data, management]
20:37:42 Do I need to redeploy MAAS now with the 00-maasdeploy.sh ?
20:49:19 narinder`: looks like the ifname is required:
20:49:21 ifnamelist.add(nic['ifname'])
20:49:21 KeyError: 'ifname'
04:03:04 Narinder Gupta proposed joid: modified worker mutiplier from 1.0 to 1.1 as 1.0 is considered as integer rather than floating. https://gerrit.opnfv.org/gerrit/23903
04:03:33 Merged joid: modified worker mutiplier from 1.0 to 1.1 as 1.0 is considered as integer rather than floating. https://gerrit.opnfv.org/gerrit/23903
11:51:17 Narinder Gupta proposed joid: added support for maas 2.0 in non virtualized environment. https://gerrit.opnfv.org/gerrit/23931
12:56:35 Narinder Gupta proposed joid: added support for maas 2.0 in non virtualized environment. https://gerrit.opnfv.org/gerrit/23931
12:57:14 Merged joid: added support for maas 2.0 in non virtualized environment. https://gerrit.opnfv.org/gerrit/23931
13:03:52 Narinder Gupta proposed joid: adding line for echo. https://gerrit.opnfv.org/gerrit/23933
13:17:53 Narinder Gupta proposed joid: adding line for echo. https://gerrit.opnfv.org/gerrit/23933
15:24:03 narindergupta: fyi - 00-maasdeploy hung at: 2016-11-02 14:08:13,719 DEBUG Executing: 'ssh -i /home/jenkins/.ssh/id_maas -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no ubuntu@10.9.1.5 grep -m 1 "MAAS controller is now configured" <(sudo tail -n 1 -F /var/log/cloud-init-output.log)' stdin=''
15:24:05 will try again
15:42:43 Has anyone seen this before?
15:42:45 django.db.utils.OperationalError: Problem installing fixture '/usr/lib/python2.7/dist-packages/metadataserver/fixtures/initial_data.yaml': Could not load auth.User(pk=1): deadlock detected
15:47:15 2016-11-03 15:32:51 UTC ERROR: deadlock detected
15:47:15 Process 21310: INSERT INTO "auth_user" ("id", "password", "last_login", "is_superuser", "username", "first_name", "last_name", "email", "is_staff", "is_active", "date_joined") VALUES (1, '!', '2012-02-16 00:00:00', false, 'maas-init-node', 'Node initializer', 'Special user', '', false, false, '2012-02-16 00:00:00')
15:47:15 Process 22046: DROP TRIGGER IF EXISTS auth_user_user_create_notify ON auth_user;
15:47:17 mbeierl, sure, that was one issue which is a race condition and we could not solve it for maas deployment. Hopefully MAAS 2.0 will be better
15:48:08 it happens if we use maas-deployer
16:20:34 narindergupta: ok, happened twice in a row (and I think I found your bug report on it...) Do I just try again and hope for the best?
16:30:00 best is to reboot your jumphost once
16:30:03 and then try
16:30:13 or run sudo apt-get update
16:30:27 and sudo apt-get dist-upgrade and reboot
16:40:02 ah, ok. Will try that
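Pulling the labconfig.yaml discussion above together: entries under nics are matched by MAC address rather than by whatever ethN name the deployed OS assigns, one MAC can carry more than one space, and the KeyError above suggests the parser still expects an ifname key even if the name itself is not meaningful. A sketch of one node's section under those assumptions (the MACs and space groupings are the pod 9 ones quoted above; the ifname values are placeholders only):

    nics:
      - ifname: eth0                      # placeholder - matching is done on the MAC below
        spaces: [admin]
        mac: ["00:1E:67:D4:30:38"]        # PXE boot / admin network
      - ifname: eth1
        spaces: [data, management]
        mac: ["00:1E:67:C5:5B:08"]
      - ifname: eth2
        spaces: [public, storage]
        mac: ["00:1E:67:C5:5B:09"]        # public is tagged (VLAN 905), storage untagged

The VLAN ids themselves are not given here; per the discussion they belong with the space definitions (the type/bridge/cidr/vlan blocks) at the bottom of labconfig.yaml.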
16:40:09 oh - it appears to have worked anyway this time
16:41:42 :)
16:42:03 the maas-deployer tool is a bit crazy and has no support
16:42:29 that was one of the reasons to abandon it in the D release and use the script to deploy and configure
16:42:29 which was something that I thought about when you were talking about customer and production uptake for JOID - as part of the JOID meeting yesterday
16:42:42 yes
16:42:47 narindergupta: really? I missed that, awesome plan
16:43:01 in the D release no maas-deployer, only MAAS and JUJU
16:43:07 also no juju-deployer tool
16:43:13 so native maas and juju
16:43:26 wow. that is significant, but I believe the right decision
16:43:38 I am very glad to hear that
16:44:26 thanks
16:44:37 and I have a script for virtual deployment with kvm today
16:44:37 ok, so the deployment appears to have completed, but now MAAS only sees eth1.905, no longer eth0.904. *sigh* Don't know what I did wrong now
16:44:41 working for baremetal
16:44:56 :)
16:45:17 eth0 was supposed to work, eth1 was the issue
16:45:24 now it's the other way around
16:45:44 yep!
16:46:02 I do have 3 fabrics now though
16:46:33 ok
16:46:39 let me check in the GUI
16:46:51 just no Mgmt 904 vlan
16:47:38 no worries, it seems the nodes were not assigned but the vlan was created
16:47:46 let me do it manually in maas
16:50:05 done
16:50:11 now only public is missing
16:50:28 how did you create it? Just for my knowledge
16:50:35 is there a maas cli for that?
16:50:54 select the interface in the node detail gui
16:50:56 narindergupta: public is there on eth1.905
16:51:00 it shows the option to add a vlan
16:51:33 why does public show 10.9.13
16:51:40 rather than 10.9.15
16:52:00 or did you mark 10.9.13 for public
16:52:05 instead of 10.9.15
16:53:20 that is what the pod9 wiki shows: 10.9.13 is supposed to be public, not .15. So I made the labconfig.yaml match the wiki
16:53:29 is it a problem that public is 13, not 15?
16:53:45 http://paste.ubuntu.com/23417224/
16:53:52 ok
16:53:57 then we are good
16:54:06 so change the floating ips on public
16:54:20 I just wanted to make my mind clear. Keep things the same where I can...
16:54:26 is .13 public in pod5?
16:54:43 no, it is .15
16:56:11 Is it a problem that eth0 shows "Unconfigured" for its subnet?
16:56:59 it could not find any subnet
16:57:13 except vlan 904
16:57:25 is that a problem from my labconfig.yaml, or does that mean something else is wrong?
16:57:43 maybe tagged and untagged are not in sync
16:57:55 btw which fabric should eth0 be on?
16:58:03 I can change it and try in the GUI
16:58:20 how do I know what fabric-1 is?
16:59:30 click on the subnets section
16:59:58 all fabrics are listed there
17:00:12 actually it could be maas 1.9
17:00:18 it may have got resolved in 2.0
17:00:29 subnets show ip addresses, not fabrics.
17:00:36 where tagged and untagged vlans can not be configured with different subnets
17:00:57 oh - wrong "subnet" in the ui
17:01:10 I was clicking the one on the left, not the top banner
17:02:21 :)
17:07:04 Ok, something is definitely not adding up. MAAS shows fabric-1 as 904 with subnet 10.9.12, but vlan 904 is supposed to be 10.9.14
17:13:51 10.9.12 is fabric one with vlan id 90
17:13:53 904
17:13:58 which is correct
17:14:05 narindergupta: do you know how fabric 1 ended up being created with VLAN 904 as subnet 10.9.12? The labconfig says 10.9.14 is 904
17:14:27 - type: management
17:14:27 bridge: brMgmt
17:14:27 cidr: 10.9.14.0/24
17:14:27 gateway: 10.9.14.1
17:14:27 vlan: 904
17:14:27 check on the jumphost
17:14:34 brMgmt
17:14:36 auto brMgmt
17:14:36 iface brMgmt inet static
17:14:36 address 10.9.14.1
17:14:36 netmask 255.255.255.0
17:14:36 bridge_ports em3.904
17:14:36 bridge_stp off
17:14:36 bridge_fd 0
17:14:37 bridge_maxwait 0
17:14:48 904 all says it is on 14, not 12
17:14:53 which is fabric 3
17:15:08 somehow it got messed up then
17:15:13 so why is 904 on fabric-1?
17:15:28 right, that is what I don't know how to fix, or how it happened
17:16:13 it means the switch says something else and it configured something else
17:16:41 as per this vlan 904 is on 10.2.12
17:16:48 sorry, 10.9.12
17:16:58 how did it get that from the switch? When I configured the interfaces as shown in the yaml and the jump host, it all works
17:17:38 let's ask this on the #maas channel
17:18:24 I'm there now, but I do not know how to ask this question
17:18:47 mbeierl, I think the same query: how maas determines the vlan
17:18:56 and how it relates to the subnet
17:28:12 mbeierl, I think I found where we create the vlan
17:28:24 it is in 00-maasdeploy.sh
17:28:32 ok
17:28:35 VLAN customization
17:28:52 I use those two commands for intel pod9
17:28:53 crvlanupdsubnet vlan904 fabric-1 "MgmtNetwork" 904 2 || true
17:28:53 crvlanupdsubnet vlan905 fabric-2 "PublicNetwork" 905 3 || true
17:29:19 case "$labname" in
17:29:19 'intelpod9' )
17:29:19 maas refresh
17:29:19 crvlanupdsubnet vlan904 fabric-1 "MgmtNetwork" 904 2 || true
17:29:19 crvlanupdsubnet vlan905 fabric-2 "PublicNetwork" 905 3 || true
17:29:20 crnodevlanint $vlan905 eth1 || true
17:29:22 crnodevlanint $vlan905 eth3 || true
17:29:24 enableautomodebyname eth1.905 AUTO "10.9.15.0/24" || true
17:29:28 enableautomodebyname eth3.905 AUTO "10.9.15.0/24" || true
17:29:30 enableautomodebyname eth0 AUTO "10.9.12.0/24" || true
17:29:32 enableautomodebyname eth2 AUTO "10.9.12.0/24" || true
17:29:34 ;;
17:29:36 esac
17:29:38 which might be wrong, let's fix that
17:30:29 I do a lot here
17:32:20 sorry, I just now opened the file and saw the code for pod9
17:32:23 oh - that is actual code, not generated?
17:32:28 got it
17:32:35 yeah
17:33:09 it totally slipped my mind. Suddenly I opened this file and saw that
17:33:24 I was under the impression that everything is generated for all labs
17:33:25 for some reason I thought that might be generated from the yaml
17:33:30 exactly, same here
17:33:49 this vlan part looks like I hard coded it for pod9
17:34:03 for other labs it is generated from labconfig
17:37:06 going to step away for a little bit - need food. We can re-run the deploy later if you: 1) change the code, or 2) cause it to be generated like some of the others....
17:41:27 mbeierl, no need to change it now
17:41:35 for now let's fix it for one node manually
18:21:05 narindergupta: sorry - how do we fix it manually?
19:37:28 narinder: wondering if you can help explain what the node_group_ifaces section of deployment.yaml is for, and if that needs to be updated to match the fabrics?
19:46:09 mbeierl, that section defines the fabric in MAAS
19:46:36 each subnet in MAAS nodes corresponds to one fabric in MAAS
19:47:00 and we need to enable DHCP in MAAS for that fabric if we want MAAS to provide dhcp
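mbeierl's earlier questions ("is there a maas cli for that?", "how do we fix it manually?") can at least be investigated from the CLI before touching anything in the GUI. A read-only sketch, assuming a logged-in CLI profile named "maas" as used elsewhere in this log; the exact output format differs between MAAS 1.9 and 2.0:

    maas maas fabrics read     # fabrics and the VLANs attached to each
    maas maas subnets read     # subnets with the fabric/VLAN each is bound to

The 904-on-fabric-1-with-10.9.12 mismatch discussed above should be visible directly in that output, which also makes it easier to phrase the question on the #maas channel.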
20:37:25 ok, because it really does not match anything - not reality, nor what was proposed in labconfig.yaml
21:11:44 narinder`: ok, still no VLANs showing up
10:50:36 narinder`: ping when you are connected; please see if you know what might be the issue here: neutron-gateway is persistently failing deployment with a config-changed hook failure. The logs follow and indicate to me some sort of configuration error in the charm maybe (see the 127.0.0.1 reference)
10:51:02 narinder`: neutron logs https://www.irccloud.com/pastebin/pZAQXPVY/
10:51:27 narinder`: this is on stable/colorado
14:28:11 narinder: ping
14:28:25 bryan_att, hi
14:28:40 did you see my note last night?
14:28:50 issue with neutron-gateway
14:30:18 bryan_att, no I have not received any. Did you send me an email?
14:30:29 in IRC
14:30:38 ping when you are connected; please see if you know what might be the issue here: neutron-gateway is persistently failing deployment with a config-changed hook failure. The logs follow and indicate to me some sort of configuration error in the charm maybe (see the 127.0.0.1 reference)
14:30:48 neutron logs https://www.irccloud.com/pastebin/pZAQXPVY/
14:31:26 bryan_att, is this stable or master?
14:31:32 (I assume that when you connect to IRC you check the logs since you were last on, in case someone has reported an issue)
14:31:37 yes, stable
14:32:27 unfortunately my irc is not showing that
14:34:14 OK, I use irccloud which gives me the log since I was last on... I'll keep that in mind for the future that yours doesn't work that way.
14:35:02 did you see the logs I just posted? looks like neutron is not configured correctly somehow
14:52:38 bryan_att, can you post me bundles.yaml?
14:53:03 https://usercontent.irccloud-cdn.com/file/rfJYbLHp/bundles.yaml
15:13:12 narinder: ping me when you have an idea what's up. I posted the bundles.yaml
15:14:21 bryan_att, yeah I looked into the bundles and I am not seeing anything wrong in them
15:14:55 can you send me the complete juju status --format=tabular; somehow I think it could be an issue with rabbitmq?
15:15:02 as it could not connect
15:15:27 bryan_att, also please send me the complete log file for neutron-gateway as well
15:29:59 https://www.irccloud.com/pastebin/qm1DTS9e/
15:31:00 which logfile is for "neutron gateway" - neutron-dhcp-agent.log neutron-l3-agent.log neutron-lbaas-agent.log neutron-metadata-agent.log neutron-metering-agent.log
15:36:05 narinder: I posted the juju status. Which neutron log are you looking for?
15:41:09 narinder`: ping - I posted the juju status. Which neutron log are you looking for?
15:54:43 narinder`: as an aside, it looks like the VLAN stuff is just not going to work in pod 9. MAAS cannot handle seeing the interfaces as non-vlan (because they are bridged to the VM and so appear as standalone) while they actually are VLANs on the bare metal nodes.
16:21:39 mbeierl, so what's the advice, can we do just vlan only?
16:22:01 Maybe MAAS 2.0 works better. Any information on the #maas channel?
16:22:52 narinder: when you get a chance, I posted the status and had a question re which neutron log you wanted
16:24:11 bryan_att, yeah, looking into it and also trying to reproduce it in a VM on one of the nodes.
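A compact way to gather what is being asked for in this exchange (the full status plus every neutron agent log) without guessing which single logfile is wanted; the juju commands are the ones already used in this log, the unit name neutron-gateway/0 is the one referred to below, and /var/log/neutron is the stock neutron log directory:

    juju status --format=tabular
    juju ssh neutron-gateway/0 "sudo tar czf /tmp/neutron-logs.tgz /var/log/neutron"
    juju scp neutron-gateway/0:/tmp/neutron-logs.tgz .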
16:24:20 ok thanks
16:24:48 and for the neutron log I am looking for juju /var/log/neutron/ all files
16:24:53 whichever application
16:27:02 narinder: is "juju /var/log/neutron/ all files" a juju command?
16:28:18 narinder: neutron-l3-agent.log https://www.irccloud.com/pastebin/AgWrDAMz/
16:29:02 narinder: neutron-l3-agent.log https://www.irccloud.com/pastebin/HS8bHUI4/
16:30:33 narinder: neutron-metadata-agent.log (last line repeats many times) https://www.irccloud.com/pastebin/AaT3Rt2j/
16:31:29 narinder: neutron-dhcp-agent.log (last line repeats many times) https://www.irccloud.com/pastebin/sFBpvkIo/
16:32:42 narinder: neutron-metering-agent.log (last line repeats many times) https://www.irccloud.com/pastebin/ET7JJgfa/
16:33:31 narinder: like pod5, I am not going to use the two additional VLANs. Just stick with the untagged interfaces as is
16:53:24 mbeierl, ok, then in MAAS let's mark the nodes to that fabric and assign a subnet, then we are good
16:53:39 yes, I did that
16:53:55 I thought the 00-maas script was supposed to do that, but there appear to be errors when that part executes
16:54:16 + maas maas interface link-subnet node-d457fc22-a2a7-11e6-baab-5254007b4bb7 eth1 mode=AUTO subnet=10.9.12.0/24
16:54:17 {"subnet": ["Select a valid choice. That choice is not one of the available choices.", "This field is required."]}
16:54:19 mbeierl, cool, let's reflect the same in labconfig.yaml as well so that once we start deployment it will be in sync
16:54:27 narinder: already done
16:54:32 mbeierl, ok
16:54:38 let's use the gui to change it
16:54:51 juju deployment started and MAAS updated with fabrics for the nodes
16:54:57 bryan_att, looks like this is the issue: ERROR neutron.agent.l3.agent [-] An interface driver must be specified
16:55:00 http://10.2.117.151/MAAS/#/node/node-d457fc22-a2a7-11e6-baab-5254007b4bb7
16:55:07 but we do define the interface in the yaml
16:55:14 so not sure what's happening here
16:55:31 narinder: yes, that part is clear, but why? Is it a config, charm, or what issue?
16:55:44 looks like the config is not applied
16:55:54 we do set up ext-port as eth1, right?
16:56:04 yes
16:56:05 it should take care of that
16:56:24 and in the ha deployment we can see everything is working
16:57:08 https://usercontent.irccloud-cdn.com/file/3V57bCrk/labconfig.yaml
16:57:37 I know this is good and so is bundles.yaml
16:57:37 The labconfig has not changed. The control node has two NICs.
16:57:50 yeah
16:57:57 everything looks perfect
16:58:21 juju ssh neutron-gateway/0
16:58:26 and ifconfig -a
16:58:30 ifconfig -a
16:59:18 hmmm I don't see eth1. Let me see if I have a bad connector.
17:00:36 narinder: eth1 is not defined for some reason https://www.irccloud.com/pastebin/3NxHJ0fy/
17:01:07 sorry, I am on m2 - I need to be on m1, hang on
17:01:37 if you did neutron-gateway/0
17:01:49 yes I did
17:01:58 then it looks like neutron-gateway is on m1
17:02:05 is on m2
17:02:08 rather than m1
17:02:13 https://www.irccloud.com/pastebin/niAOXxAJ/
17:02:43 yeah, your neutron-gateway is on m2
17:02:50 rather than m1
17:03:10 that is the reason it is not working
17:04:06 In the past it was on m1...
17:04:18 narinder: why would it get set up on m2?
17:08:56 bryan_att, in our hyperconverged model it can be any node. But you can try -f dishypcon in your deployment
17:09:08 then it will use the server tagged as network
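Since the root cause here turned out to be the neutron-gateway unit landing on a machine (m2) without the ext-port NIC, the check that settled it can be scripted along the same lines; unit and interface names are the ones from this exchange:

    juju status --format=tabular | grep neutron-gateway        # which machine did the unit land on?
    juju ssh neutron-gateway/0 "ifconfig -a | grep -A1 eth1"   # is eth1 (the intended ext-port) present there?

If eth1 is missing on that machine then, per the diagnosis above, the config-changed failure and the "An interface driver must be specified" error trace back to the unit never getting a usable external port.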
17:12:25 what is "-f dishypcon" and where do I add that as an option
17:13:49 this will disable the hyperconverged architecture and go to the legacy one, and it would expect the tags in MAAS
17:13:54 which should be there
17:14:05 ok, that option is in the deploy command?
17:14:11 correct
17:14:18 thanks, I will try that
17:16:22 bryan_att, I will stop reproducing the issue now as we know the actual issue now
17:23:08 narinder: thanks for your help
17:25:18 no problem
17:58:45 narinder: ok, JOID is deployed on pod 9, but I am back in the situation again where I have no networking. Cannot ping floating ips, cannot connect to instance consoles
18:01:19 mbeierl, for the instance console we do not enable vnc
18:01:29 for networking let me have a look
18:01:35 ok
18:03:43 what is the ext-port you used?
18:04:24 it shows data-port: br-ex:eth3
18:04:27 eth3
18:04:31 correct
18:04:32 but eth3 is the admin port
18:04:54 os-data-network: 10.9.12.0/24
18:05:07 what is the public network?
18:05:08 what is ext supposed to be then?
18:06:21 ext-port is used for floating ips and used to get external access to the vms
18:06:37 it should be separate from the pxe network
18:08:18 should eth1 be on the 10.9.15.x network?
18:08:36 if yes, let's use eth1 in labconfig.yaml
18:08:43 instead of eth3
18:09:04 and also, is there any gateway on 10.9.15.x ?
18:09:38 I guess, without being able to see what is done in pod5 or 6, I have no idea what I am supposed to put where
18:09:53 10.9.15.1 is the jump host
18:09:57 yes
18:10:18 I don't know if it's set up to route things off and act as a gateway
18:10:31 if you have a configuration guide or something I could follow, that would help
18:10:52 on the jumphost I am forwarding this to 10.2.117 for external access
18:10:56 I'm trying to copy pod5 and 6, but they appear to be physically different in that they have eth5, where I don't
18:11:21 that's fine
18:11:33 if eth1 is your 10.9.15.x then use eth1
18:11:55 That is what I thought I used - according to maas
18:12:07 and also, if there is no other gateway then make the jumphost the gateway and enable NAT routing
18:12:14 ok
18:12:20 so let's change it to eth1
18:12:36 It is eth1, isn't it?
18:12:37 and also enable NAT routing from the 10.9.15.x subnet to 10.2.117.x
18:12:41 yes
18:12:49 it is eth1 on the node
18:12:57 so what is the "it" that needs to change to eth1?
18:13:23 node 2 mac is 00:1e:67:e2:6c:75
18:13:31 and this should be 10.9.15.x, correct
18:13:40 in labconfig.yaml
18:13:47 - spaces: [public]
18:13:47 ifname: eth1
18:13:47 mac: ["00:1E:67:E2:6C:75"]
18:13:49 yes
18:13:59 ext-port: "eth3"
18:14:08 change it to ext-port: "eth1"
18:14:25 ah. did not see that
18:14:30 :)
18:14:50 also enable nat routing from 10.9.15.1 to 10.2.117.149
18:15:22 so that instances can access the outside world; then we can do it the right way and everything should work
18:17:18 natting done
18:17:31 do I need to redeploy to change ext-port to eth1?
18:18:11 we can try changing the ext-port
18:18:17 but it may or may not work
18:18:23 let's redeploy
18:18:32 Is that a full maas redeploy?
18:18:38 or just juju?
18:18:39 no maas
18:18:49 only ./deploy.sh
18:19:07 I have changed eth3 to eth1 in all files
18:19:09 what about the copy of labconfig.yaml that is in the joid/ci directory?
18:19:20 I did that already
18:19:24 you can verify
18:19:29 ah ok
18:19:55 so I need to clean something first, right?
18:20:01 yeah
18:20:12 just clean.sh
18:20:14 ci/clean.sh or cleanvm?
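The NAT routing agreed on above (the jumphost forwarding the 10.9.15.x public subnet out through the 10.2.117.x uplink) is plain Linux forwarding plus masquerade. A minimal sketch to run on the jumphost; the uplink interface name ("em1") is assumed, since it is not given in the log:

    sudo sysctl -w net.ipv4.ip_forward=1
    sudo iptables -t nat -A POSTROUTING -s 10.9.15.0/24 -o em1 -j MASQUERADE   # em1 = 10.2.117.x uplink (name assumed)
    sudo iptables -A FORWARD -s 10.9.15.0/24 -j ACCEPT
    sudo iptables -A FORWARD -d 10.9.15.0/24 -m state --state RELATED,ESTABLISHED -j ACCEPT

After the ext-port change from eth3 to eth1, the redeploy path described above is just ci/clean.sh followed by ./deploy.sh; MAAS itself is left alone.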
18:21:09 cool, let's see
19:14:18 mbeierl, seems to be working and we will know soon as I started configuring juju on top of openstack
19:14:44 ok
19:21:44 mbeierl, looks like the server can reach maas 10.9.15.5
19:22:21 let's enable dhcp on 10.9.15.x also on the nodes
19:24:49 sorry - is that via the MAAS interface?
19:26:17 yeah
19:26:39 so we would need to release the servers to do that?
19:26:44 let's see
19:26:51 I am auto enabling it
19:26:52 oh - I see you did that
19:26:54 ok
19:27:30 is brPublic a vlan?
19:27:33 no, right?
19:27:56 brPublic is
19:28:19 ok, then we need to make changes in the nodes again and mark them as 10.9.15.
19:28:28 also eth1.905
19:30:18 need to create the vlan also
19:35:00 the VLAN stuff does not work
19:35:26 so we can change the jump host configuration to get rid of the VLANs in /etc/network/interfaces
19:51:05 let me give it a try
19:53:57 can you confirm node 1 00:1e:67:c5:5b:09 is on vlan 905
19:54:19 I thought we were getting rid of the VLANs
19:54:41 but, yes, that would be 905
19:54:46 until we change the jumphost we can not
19:55:33 let's give this a last try; if it does not work then we need to remove the bridge on the vlan and I need to make sure it uses the network interface without a vlan
19:56:26 you can try, but I could not figure out how to tell MAAS that it does not have a VLAN when it is in MAAS, but does have one when it is bare metal
20:14:06 mbeierl, the maas subnet is bridged
20:14:15 so maas can provide dhcp
20:14:35 now we need to configure it on baremetal and provide the correct network in labconfig
20:14:41 which I am trying now
20:14:51 here eth1.905 will be created
20:14:54 but not eth1
20:14:59 ah, ok
20:15:25 and I am saying in labconfig to use eth1.905 for external communication
20:15:41 let's see
20:15:43 ?
20:15:54 keeping my fingers crossed
20:19:44 ok, looks good so far
20:20:01 can reach maas also
20:20:12 as well as the jumphost
20:21:11 also the default gateway is set to 10.9.15.1
20:21:27 which is nat routed through 10.2.117.x, correct?
20:31:58 outbound, yes
21:06:14 mbeierl, this data network, is it vlan tagged or not?
21:07:54 cool, it's working
21:08:06 mbeierl, I can ping and ssh into floating ips now
21:08:34 and apt-get update is also working
21:11:10 so the vlan is working
21:11:24 we only need to configure it correctly in MAAS
22:38:24 bryan_att, how is the installation progressing?
12:02:19 narinder`: the install failed for the same reason, even though this time the neutron-gateway was set up on rack1-m1. I do see that MAAS for some reason does not show the 2nd NIC on either machine. Seems there was some recent MAAS change that is preventing MAAS from correctly setting up the 2nd NICs.
12:04:01 narinder`: These are USB ethernet adapter ports that previously worked fine, i.e. were detected by MAAS. One other difference is that I still get the MAAS "failed commissioning" and have to manually invoke commissioning. Perhaps in that process it's not finding the 2nd NIC now.
12:57:04 narinder`: I'm trying again, this time adding the 2nd NIC explicitly to labconfig.yaml. Seems before it was discovering the NICs but now I have to add them explicitly.
14:46:44 narinder`: sorry - was done for the weekend when you sent that last message. I see that it is working, but if I need to deploy OpenStack via JuJu again at some point, is there anything special I need to do?
15:22:58 Mark Beierl proposed joid: Intel pod 9 updates https://gerrit.opnfv.org/gerrit/24023
18:01:36 narinder`: question for you - if I want to run as part of a Jenkins job, how do I ask JOID to give me the admin-rc file so that I can talk to the deployed OpenStack?
15:04:16 Merged joid: adding line for echo. https://gerrit.opnfv.org/gerrit/23933
15:05:14 Narinder Gupta proposed joid: Intel pod 9 updates https://gerrit.opnfv.org/gerrit/24023
15:13:23 mbeierl, how is everything going?
15:13:45 mbeierl, it seems the vlan worked after defining it manually as per the wiki.
15:14:27 sorry... listening to the TSC call. I had to redeploy again because the Ceph subsystem went down again. I could not delete over 50% of the volumes
15:15:58 so, how do CI jobs know how to find the API endpoints? The equivalent of downloading the .rc file?
15:22:37 narindergupta: is there an API from JuJu that I can use to get the tenant ID?
15:23:24 yuanyou_, mbeierl we do have the rc file in ~/ created by joid
15:24:17 use openstack project list and openstack endpoint list
15:24:23 to know the endpoint details
15:24:26 narindergupta: oh, wow. did not know that one even existed
15:24:32 that's all I need
15:24:33 :)
15:24:47 mbeierl, ok
15:24:53 there should be nova.rc
15:25:07 also in the joid/ci/cloud/admin-openrc file
15:26:08 narindergupta: actually ~/admin.rc is incorrect
15:26:19 export OS_AUTH_URL=http://10.9.1.90:5000/v2.0
15:26:32 but keystone is actually at 1.166, not 1.90
15:26:42 and the date of admin.rc is about 1 week old
15:26:50 nova.rc looks to be correct
15:27:40 and neither nova.rc nor cloud/admin-openrc has the tenant ID in it
15:29:31 mbeierl, those are the files generated by openstack.sh after deployment
15:29:50 narindergupta: which file?
15:29:56 nova.rc
15:30:03 and cloud/admin-openrc
15:30:26 for now change the keystone ip and try
15:30:42 after that you can run openstack commands after sourcing that nova.rc
15:31:35 openstack project list will give you the project list, and using the openstack command you can get anything
15:32:01 we capture what we need to run openstack commands
15:34:35 right, but project list does not give me the tenant id
15:35:19 nova list --all-tenants
15:35:43 'nova list' shows the instance in the admin tenant
15:36:19 I would look into this in case you need more information http://docs.openstack.org/cli-reference/
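For the CI question above (finding the API endpoints and the tenant/project id from a Jenkins job), the rc files narinder points at can be sourced and everything else pulled from the OpenStack client. A sketch, assuming the file locations quoted above and that OS_AUTH_URL points at the current keystone address:

    source ~/joid/ci/cloud/admin-openrc            # or ~/nova.rc
    openstack endpoint list                        # API endpoints
    openstack project list                         # projects (tenants) with their IDs
    openstack project show admin -f value -c id    # just the admin tenant id
    nova list --all-tenants                        # instances across tenants, as suggested above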
14:53:34 David_Orange, hi david, good morning
14:53:52 Hi narinder, how are you ?
14:54:03 David_Orange, I am fine, how about you?
14:54:19 narindergupta: I am fine, thank you
14:54:30 David_Orange, our MAAS team developed a consolidated maas view across the data centre.
14:55:18 for example if you have 4 different maas regional controllers, then we have meta-maas which will take all four MAAS into a single view and synchronize all images and users across the maases
14:55:19 narindergupta: what does it mean ? 1 MAAS for a multi pod ?
14:55:35 ok, one top maas
14:55:46 it can be useful
14:55:55 yeah, it's like one view of maas for multiple pods where you can synchronize the images and users across all maas controllers
14:56:30 yeah, we want to implement it first in the orange lab; if successful then we can do it in the Intel lab, then we can show it to the linux foundation.
14:57:01 narindergupta: why not, we have to think about it
14:57:20 David_Orange, sure, let me know whenever we are ready
14:57:58 we are thinking about rebuilding our DC, to have more homogeneous pods (5 identical servers)
14:58:39 or even more dynamic later (in order to dynamically reserve x servers for a pod)
14:59:08 I will take a break with my team and let them know about your proposition
14:59:26 David_Orange, this is something nice and we can use maas everywhere
14:59:53 where each user can be assigned one pod and their keys will be added for them to use
15:00:04 but first I have a question: does this top maas have all the configuration and can delegate nodes to the sub maas, or is it the sub maas that shares its config ?
15:00:30 David_Orange, this top maas is just a read-only view of the other maases
15:00:42 ok
15:00:43 so that you will know the status
15:01:08 but user synchronization and image synchronization will occur across the maases
15:01:14 ok
15:01:21 rack controller
15:01:31 it is much more for supervision
15:01:41 https://github.com/maas/meta-maas/blob/master/README.md
15:01:50 yeah, like one view of the data centre
15:04:00 I am not sure it will be useful for my idea of dynamic pods, but we can test it
15:04:45 I will take a break and talk with my team about that, then I need to talk about the public api network
15:04:51 I will come back
15:29:36 narindergupta: I am back
15:29:59 ok
15:30:02 narindergupta: we are ok for the trial of the meta-maas
15:30:40 but it will be done in priority 3 :)
15:30:40 1/ testing colorado with the public api network
15:30:44 David_Orange, sounds good, I am getting a few clarifications on the same and we can run it on one of the maas servers or in a vm somewhere
15:30:56 ok, sounds good to me
15:31:07 currently moving to the maas 2.0 work
15:31:14 2/ install pod2 with colorado and remote access to the horizon console
15:31:22 once that is done the public api issue will get resolved
15:32:29 as I understood at ODS, you need as many public ips as containers
15:32:39 4 ips per service
15:33:13 yeah, that's true, and we can do it for the required services
15:33:23 not for all initially.
15:33:50 in the future the charm team is thinking of embedded haproxy so that a single ip can be used.
15:34:02 but it will take more time, though, not immediate
15:35:16 haproxy is the solution I was using
15:35:29 so I agree :)
15:36:19 so for now, we need 30 to 40 ips ?
15:37:49 do you take them on the public network IP range or do you need another public network ?
15:43:15 narindergupta: ^
15:46:09 David_Orange, I will use them on the public network ip range
15:46:32 David_Orange, do we have enough? if not then we can have another public network
15:46:47 we do not have enough
15:47:20 so in that case we can have a separate network
15:47:27 like in the intel labs
15:47:30 can we have one public network for infra access (api + horizon) and another for public ips for the VMs ?
15:47:38 ok, great
15:47:50 I already prepared them :)
15:47:54 David_Orange, :)
15:49:05 narindergupta: on the top router, I have to configure the switch
15:50:15 can the public network (for VMs) be handled on a VLAN or is it better to handle it on an access vlan (like today)
15:51:05 or better could be to share the admin network and the api network on the same link with vlans
15:51:40 it can be a vlan
15:52:46 ok
15:53:00 can you share a labconfig with vlan ?
15:53:25 David_Orange, those changes are required manually in maas today
15:53:38 ok
15:53:39 but in MAAS 2.0 I will see how I can implement it
15:53:57 so the first step is to install maas 2
15:54:04 correct
15:54:16 is it available for colorado or only for master ?
15:54:17 I have a script to do that but it won't add nodes yet
15:54:20 as it is wip
15:54:27 master only
15:54:31 ok
15:55:32 for pod1 it is not a requirement, but for pod2 I will need a stable install (but it can be master if master is stable)
17:05:07 narindergupta: is the meeting now or in 1 hour? we are in winter time now
17:05:12 in France
17:05:18 arturt_: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
17:05:25 #endmeeting