15:05:45 <CASP3R> #startmeeting weekly integration meeting
15:05:45 <odl_meetbot> Meeting started Thu Sep 11 15:05:45 2014 UTC.  The chair is CASP3R. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:05:45 <odl_meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:05:45 <odl_meetbot> The meeting name has been set to 'weekly_integration_meeting'
15:06:11 <CASP3R> #chair LuisGomez catohornet
15:06:11 <odl_meetbot> Current chairs: CASP3R LuisGomez catohornet
15:09:21 <CASP3R> #topic project update
15:09:40 <CASP3R> #info all testing good for base_of13  (OSGi)
15:10:01 <CASP3R> #info karaf all testing is all good but 1 test failing around topo in performance
15:26:19 <CASP3R> #info to get another Robot VM we have to move to rackspace (static) lab
15:34:04 <odp-gerritbot> Priyanka Chopra proposed a change to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11036
15:36:03 <rexpugh> @chrisprice - ping rex when this meeting wraps up
15:36:35 <CASP3R> rexpugh Chris Price doesn't hang around here
15:36:44 <CASP3R> This is Chris O'Shea
15:37:28 <rexpugh> thanks chris O
15:53:33 <odp-gerritbot> A change was merged to integration: Updating and fixing xmls  https://git.opendaylight.org/gerrit/11045
17:57:22 <odp-gerritbot> Abhishek Kumar proposed a change to integration: Basic recovery scripts  https://git.opendaylight.org/gerrit/11064
21:04:51 <odp-gerritbot> Priyanka Chopra proposed a change to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11036
00:11:16 <odp-gerritbot> Priyanka Chopra proposed a change to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11036
00:31:46 <odp-gerritbot> Christopher O'Shea proposed a change to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11077
01:39:07 <odp-gerritbot> Christopher O'Shea proposed a change to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11077
01:50:28 <odp-gerritbot> Carol Sanders proposed a change to integration: Adding NetCONF Test Suite  https://git.opendaylight.org/gerrit/11080
01:50:38 <odp-gerritbot> Christopher O'Shea proposed a change to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11077
02:30:34 <odp-gerritbot> Kamal Rameshan proposed a change to integration: robot integration tests for router rpc in datastore clustering  https://git.opendaylight.org/gerrit/11081
02:31:19 <odp-gerritbot> Hideyuki Tai proposed a change to integration: Added VTN Coordinator to Karaf distribution.  https://git.opendaylight.org/gerrit/11082
02:33:19 <odp-gerritbot> Carol Sanders proposed a change to integration: Adding changes to NETCONF Test Suite  https://git.opendaylight.org/gerrit/11083
05:16:58 <odp-gerritbot> Rafat Jahan proposed a change to integration: Adding sdninterfaceapp features  https://git.opendaylight.org/gerrit/11086
07:09:32 <odp-gerritbot> Rafat Jahan proposed a change to integration: Karaf built and integrated  https://git.opendaylight.org/gerrit/11089
08:45:58 <frankieonuonga> good morning folks
08:47:40 <odp-gerritbot> Peter Gubka proposed a change to integration:     Updating xmls for flows to have unique flow id  https://git.opendaylight.org/gerrit/11091
09:40:42 <odp-gerritbot> Rafat Jahan proposed a change to integration: Adding sdninterfaceapp features  https://git.opendaylight.org/gerrit/11093
09:55:00 <odp-gerritbot> Rafat Jahan proposed a change to integration: Adding sdninterfaceapp features  https://git.opendaylight.org/gerrit/11094
10:20:03 <odp-gerritbot> Peter Gubka proposed a change to integration: A test which connects 256 switches.  https://git.opendaylight.org/gerrit/11095
15:19:00 <odp-gerritbot> A change was merged to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11077
15:55:19 <odp-gerritbot> Rafat Jahan proposed a change to integration: Adding sdninterfaceapp features  https://git.opendaylight.org/gerrit/11111
16:31:42 <odp-gerritbot> Rafat Jahan proposed a change to integration: Adding sdninterfaceapp features  https://git.opendaylight.org/gerrit/11112
16:34:55 <odp-gerritbot> Rafat Jahan proposed a change to integration: Adding sdninterfaceapp features  https://git.opendaylight.org/gerrit/11094
16:55:11 <odp-gerritbot> Basheeruddin Ahmed proposed a change to integration: Initial isolation test integrated with robot framework Usage: pybot -v LEADER:<leader_ip> -v PORT:8080 -v FOLLOWER1:<flwr1_ip> -v FOLLOWER2:<flwr2_ip> ~/integration/test/csit/suites/clustering/datastore/basic/<testname.txt> Adding start/stop controller ut  https://git.opendaylight.org/gerrit/10959
16:55:12 <odp-gerritbot> Basheeruddin Ahmed proposed a change to integration: renamed the test cases file name to contain  proper sequence number  https://git.opendaylight.org/gerrit/11026
16:55:13 <odp-gerritbot> Basheeruddin Ahmed proposed a change to integration: Added library to determine Shard Cluster Roles  https://git.opendaylight.org/gerrit/11016
16:55:14 <odp-gerritbot> Basheeruddin Ahmed proposed a change to integration: UtilLibrary now uses requests session instead of direct post /get which seem to open active connections  https://git.opendaylight.org/gerrit/11113
17:54:49 <odp-gerritbot> Rafat Jahan proposed a change to integration: Adding sdninterfaceapp features  https://git.opendaylight.org/gerrit/11117
18:17:34 <hideyuki_> Hi, committers of Integration Group. I would like you to review and approve my patch before RC1. https://git.opendaylight.org/gerrit/#/c/11082/
18:34:53 <odp-gerritbot> A change was merged to integration: Added VTN Coordinator to Karaf distribution.  https://git.opendaylight.org/gerrit/11082
19:56:50 <odp-gerritbot> Jamo Luhrsen proposed a change to integration: INPORT action test now functional.  There is a bug open to move that action to IN_PORT, but that may or may not be resolved.  Depending on that, this test case may have to revert back to IN_PORT (bug is https://bugs.opendaylight.org/show_bug.cgi?id=1725)  https://git.opendaylight.org/gerrit/11118
23:22:34 <odp-gerritbot> Priyanka Chopra proposed a change to integration: Adding plugin2oc features  https://git.opendaylight.org/gerrit/11129
06:48:44 <tykeal> Reminder: the integration jenkins will be going offline in less than 15 minutes for a 4 hour outage.
06:50:22 <CASP3R> yea that l2switch was just testing something
06:51:08 <tykeal> I'll let the current test pass through, but I'm disabling the polling and gerrit trigger jobs now so no more should come into the system. Do not trigger any more manual tests
06:55:34 <CASP3R> LuisGomez hey do you want to abandon that job cause the compatible-min will take 30 mins
06:56:22 <LuisGomez> right, we can drop it
06:57:27 <CASP3R> ok done.
06:57:42 <tykeal> thank you :)
06:58:13 <CASP3R> alright i'm out, have a good change window :P
06:58:21 <tykeal> thanks, get some sleep ;)
10:35:07 <tykeal> jenkins silo is back online, it's currently running a bit sluggish though. I can't fix it at the moment because of an API outage at Rackspace.
18:57:54 <tykeal> LuisGomez: any idea why the failed tests in that latest build are trying to pull a SNAPSHOT artifact from the release repos?
18:59:16 <LuisGomez> really?
18:59:20 <LuisGomez> let me see that
19:01:03 <tykeal> yeah, we're running into it with the move to the new environment because to help with artifact movement around the environments we use a different nexus server (proxy to our main one which is in a different DC). As such all artifact retrieval is forced to that repo for either stuff out of the master release view or out of the snapshot repo...
19:01:27 <LuisGomez> ok, i think integration PAX-EXAM fails because of this line:
19:01:29 <LuisGomez> 2014-09-13 18:51:18,818 | WARN  | n(3)-10.30.11.17 | AetherBasedResolver              | 5 - org.ops4j.pax.url.mvn - 1.6.0 | Error resolving artifactorg.opendaylight.snmp4sdn:plugin-shell:jar:0.1.3-SNAPSHOT:Could not find artifact org.opendaylight.snmp4sdn:plugin-shell:jar:0.1.3-SNAPSHOT in nexus-release-mirror (http://nexus01.dfw.opendaylight.org:8081/nexus/content/groups/public/)
19:01:30 <LuisGomez> org.sonatype.aether.resolution.ArtifactResolutionException: Could not find artifact org.opendaylight.snmp4sdn:plugin-shell:jar:0.1.3-SNAPSHOT in nexus-release-mirror (http://nexus01.dfw.opendaylight.org:8081/nexus/content/groups/public/)
19:01:43 <LuisGomez> is this what you are saying?
19:01:48 <tykeal> yes, but that's the release view repo and not the snapshot repo
19:01:56 <LuisGomez> ok
19:01:57 <tykeal> of course it won't find a SNAPSHOT artifact there
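The mismatch tykeal describes (a SNAPSHOT version requested from the release view) can be spotted mechanically in a karaf log. A minimal sketch, assuming the message shape seen in the pasted warning; the function name and the "repo id contains 'release'" heuristic are illustrative assumptions, not part of any ODL tooling:

```python
import re

# Matches the "Could not find artifact <coords> in <repo-id> (<url>)" text
# that appears in the pasted AetherBasedResolver warning.
NOT_FOUND = re.compile(
    r"Could not find artifact (?P<coords>\S+) in (?P<repo_id>\S+) \((?P<url>[^)\s]+)\)"
)


def snapshot_requested_from_release_repo(log_line):
    """Return (coords, repo_id) when a SNAPSHOT was looked up in a repo whose
    id contains 'release' (a heuristic assumption); otherwise return None."""
    match = NOT_FOUND.search(log_line)
    if match is None:
        return None
    coords, repo_id = match.group("coords"), match.group("repo_id")
    version = coords.rsplit(":", 1)[-1]  # the version is the last coordinate field
    if version.endswith("-SNAPSHOT") and "release" in repo_id:
        return coords, repo_id
    return None


line = (
    "org.sonatype.aether.resolution.ArtifactResolutionException: "
    "Could not find artifact org.opendaylight.snmp4sdn:plugin-shell:jar:0.1.3-SNAPSHOT "
    "in nexus-release-mirror "
    "(http://nexus01.dfw.opendaylight.org:8081/nexus/content/groups/public/)"
)
print(snapshot_requested_from_release_repo(line))
```

Such a check only flags the symptom. It does not say whether the cause is a wrong repository declaration or an artifact missing from the local repository.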
19:02:27 <LuisGomez> so i need to figure out why this project fetches from wrong place
19:02:56 <tykeal> LuisGomez: the forced repos configuration we're using looks a bit like what's defined at the bottom of this: https://wiki.opendaylight.org/view/Infrastructure:Nexus
19:03:35 <tykeal> you'll see that we basically say, unless the artifact is supposed to be pulled from opendaylight.snapshot look at our release meta repo
19:04:59 <tykeal> I could disable the forced repo configs (like it was in LF) but it would a) have to traverse 1/2 the continent to get resources (adding time) and b) wouldn't let us find things like this that are somehow broken ;)
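The forced-repo rule ("unless the artifact is supposed to come from opendaylight.snapshot, look at the release meta repo") resembles a Maven mirror with an exclusion. A simplified model of how a `mirrorOf` pattern with `*` and `!id` entries selects repositories; the pattern string below is an assumption for illustration, not the value from the wiki page, and Maven's real matcher also supports forms such as `external:*` that are omitted here:

```python
def mirror_applies(mirror_of, repo_id):
    """Simplified mirrorOf matching: explicit ids, '*' wildcard, '!id' exclusions.

    An exclusion wins over the wildcard, so '*,!x' means everything except x.
    """
    patterns = [p.strip() for p in mirror_of.split(",") if p.strip()]
    if "!" + repo_id in patterns:
        return False
    return "*" in patterns or repo_id in patterns


# Hypothetical pattern: mirror every repository except the snapshot repo.
PATTERN = "*,!opendaylight.snapshot"

for repo in ("central", "opendaylight.release", "opendaylight.snapshot"):
    target = "release mirror" if mirror_applies(PATTERN, repo) else "its own URL"
    print(f"{repo}: resolved via {target}")
```

Under this model, anything that does not explicitly declare the snapshot repository ends up at the release mirror, which is where the failing lookup went.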
19:05:54 <LuisGomez> ok, just for me to understand this is an issue in the snmp project pom file right?
19:06:21 <tykeal> umm... I have to assume so since it's an snmp4sdn component
19:06:30 <LuisGomez> i think so
19:06:52 <LuisGomez> edwarnicke, are you there?
19:07:11 <tykeal> the thing is, their build silo has been under this restriction for some time, so unless they aren't doing some testing that would have exposed it, it should have already been fixed
19:07:14 <edwarnicke> LuisGomez: Yes :)
19:07:47 <LuisGomez> integration build fails because snmp is fetching artifact from wrong place apparently
19:08:02 <LuisGomez> 2014-09-13 18:51:18,818 | WARN  | n(3)-10.30.11.17 | AetherBasedResolver              | 5 - org.ops4j.pax.url.mvn - 1.6.0 | Error resolving artifactorg.opendaylight.snmp4sdn:plugin-shell:jar:0.1.3-SNAPSHOT:Could not find artifact org.opendaylight.snmp4sdn:plugin-shell:jar:0.1.3-SNAPSHOT in nexus-release-mirror (http://nexus01.dfw.opendaylight.org:8081/nexus/content/groups/public/)
19:08:03 <LuisGomez> org.sonatype.aether.resolution.ArtifactResolutionException: Could not find artifact org.opendaylight.snmp4sdn:plugin-shell:jar:0.1.3-SNAPSHOT in nexus-release-mirror (http://nexus01.dfw.opendaylight.org:8081/nexus/content/groups/public/)
19:08:05 <tykeal> build isn't failing... it's unstable ;) tests failing
19:08:06 * edwarnicke reads the back thread
19:08:18 <LuisGomez> correct PAX-EXAM does not pass
19:08:28 <LuisGomez> build is OK
19:08:31 <edwarnicke> LuisGomez: So the wiring tests are failing?
19:08:38 <LuisGomez> yes
19:08:41 <edwarnicke> in integration/features/ ?
19:08:47 <LuisGomez> let me post the console
19:08:54 <edwarnicke> LuisGomez: Thanks :)
19:09:01 <tykeal> https://jenkins.opendaylight.org/integration/view/Polling%20Jobs/job/integration-master-project-centralized-integration/2385/org.opendaylight.integration$features-integration/testReport/installFeature%28org.opendaylight.yangtools.featuretest.SingleFeatureTest%29%5BrepoUrl_%20file__opt_jenkins-integration_workspace_integration-master-project-centralized-integration_features_target_classes_features.xml,%20Feature_%20odl-integra
19:09:01 <LuisGomez> https://jenkins.opendaylight.org/integration/view/Integration%20jobs/job/integration-master-project-centralized-integration/2386/consoleFull
19:09:15 <tykeal> mine just links to one of the errors ;)
19:09:55 <edwarnicke> Ah
19:09:56 <LuisGomez> tykeal, your link does not work
19:09:56 * tykeal notes that the artifact resolution is failing since it's trying to pull a SNAPSHOT artifact from a release repo
19:09:56 <edwarnicke> OK
19:10:01 <edwarnicke> This isn't a wrong repo thing
19:10:21 <edwarnicke> Or at least I am pretty sure its not
19:10:48 <edwarnicke> I believe this is a 'SNMP4SDN added a new bundle to their features.xml file and not their features/pom.xml
19:10:49 <edwarnicke> '
19:10:50 <edwarnicke> problems
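The kind of mismatch edwarnicke suspects (a bundle listed in features.xml with no matching dependency in features/pom.xml) can be checked with a small script. A rough sketch, assuming bundles are written as plain `mvn:groupId/artifactId/version` URLs; the feature name and the `plugin` artifact in the sample data are hypothetical, and prefixed URLs such as `wrap:mvn:...` are ignored:

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as '{http://...}bundle' down to 'bundle'."""
    return tag.rsplit("}", 1)[-1]


def bundles_in_features(features_xml):
    """Collect (groupId, artifactId) pairs from mvn: bundle URLs."""
    found = set()
    for elem in ET.fromstring(features_xml).iter():
        text = (elem.text or "").strip()
        if local_name(elem.tag) == "bundle" and text.startswith("mvn:"):
            parts = text[len("mvn:"):].split("/")
            if len(parts) >= 2:
                found.add((parts[0], parts[1]))
    return found


def dependencies_in_pom(pom_xml):
    """Collect (groupId, artifactId) pairs from every <dependency> element."""
    found = set()
    for elem in ET.fromstring(pom_xml).iter():
        if local_name(elem.tag) != "dependency":
            continue
        fields = {local_name(c.tag): (c.text or "").strip() for c in elem}
        found.add((fields.get("groupId"), fields.get("artifactId")))
    return found


def missing_from_pom(features_xml, pom_xml):
    """Bundles referenced by the features file that the pom never declares."""
    return sorted(bundles_in_features(features_xml) - dependencies_in_pom(pom_xml))


FEATURES = """
<features name="snmp4sdn" xmlns="http://karaf.apache.org/xmlns/features/v1.2.0">
  <feature name="odl-snmp4sdn-all" version="0.1.3-SNAPSHOT">
    <bundle>mvn:org.opendaylight.snmp4sdn/plugin/0.1.3-SNAPSHOT</bundle>
    <bundle>mvn:org.opendaylight.snmp4sdn/plugin-shell/0.1.3-SNAPSHOT</bundle>
  </feature>
</features>
"""

POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.opendaylight.snmp4sdn</groupId>
      <artifactId>plugin</artifactId>
      <version>0.1.3-SNAPSHOT</version>
    </dependency>
  </dependencies>
</project>
"""

print(missing_from_pom(FEATURES, POM))
```

With the sample data the only bundle reported is `plugin-shell`, the artifact named in the failing lookup.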
19:10:53 <tykeal> I verified that the artifact(s) in question do exist in the nexus01 proxy in the opendaylight.snapshot repo
19:11:03 <edwarnicke> https://git.opendaylight.org/gerrit/#/c/11133/
19:11:30 <edwarnicke> tykeal: I would expect they do
19:11:35 <edwarnicke> tykeal: Let me explain what's happening
19:11:44 <tykeal> ah, and their job wouldn't fail because they were actually building the artifacts in question
19:12:04 <LuisGomez> there you go
19:12:25 <edwarnicke> karaf looks for artifacts in the local .m2 or in well known repos like central (or I think in places defined as release repos in settings.xml)
19:12:25 <tykeal> in other words, we caught an actual error...
19:12:31 <edwarnicke> tykeal: YES :)
19:12:46 <edwarnicke> While I would prefer *not* to use integration as a test case for this problem
19:12:58 <LuisGomez> the snmp folks are active now?
19:13:03 <edwarnicke> I have not had the time (and probably won't have the time) to write the test that locks this one down
19:13:03 <edwarnicke> )
19:13:10 <tykeal> ok, was just seriously worried since all the runs to completion of this job since moving to Rackspace are UNSTABLE
19:13:39 <LuisGomez> just coincidence tykeal
19:13:46 <edwarnicke> (we also catch some other very subtle bugs in autorelease... just pushed a fix for one this morning... although I *can't* think of a way to write a test short of autorelease to catch the problems only it catches)
19:13:47 * tykeal feels better
19:13:58 <edwarnicke> tykeal: Your infra is doing its normal awesomeness
19:14:07 <tykeal> ok
19:14:20 <edwarnicke> tykeal: The most we can accuse you of is having the foresight to set up the infra in a way here that facilitates catching bugs of these kinds ;)
19:14:33 <tykeal> hehe
19:15:01 * edwarnicke amusing faux glare ;)
19:15:12 <edwarnicke> LuisGomez: So we have to figure out what to do about it
19:15:32 <edwarnicke> I think the options are this:
19:15:39 <tykeal> so, I don't know if the tests that are failing would add to the build time significantly but I notice that even with the UNSTABLE these builds are _much_ faster for this job. Previously they were ~45 minutes; all of these UNSTABLE builds this morning have been ~10 minutes
19:15:55 <edwarnicke> tykeal: Victory :)
19:16:22 <edwarnicke> So LuisGomez, here's what I see as the decision tree from here:
19:16:39 <edwarnicke> Decision1: Who creates the fix patch for snmp4sdn
19:16:45 <edwarnicke> Decision1.Option1: I do
19:16:54 <edwarnicke> Decision1.Option2: We email Christine and ask her to
19:17:13 <LuisGomez> thats all we can do right?
19:17:13 <edwarnicke> Decision2: What do we do about the breakage until SNMP4SDN merges a patch
19:17:30 <edwarnicke> Decision2.Option1: We let everything stay broken till SNMP4SDN fixes itself
19:17:50 <edwarnicke> Decision2.Option2: We comment out SNMP4SDN in integration, and in autorelease until the fix patch is merged
19:18:08 <LuisGomez> ok thats not bad either
19:18:09 <edwarnicke> LuisGomez: Well... you or tykeal could write the fix for Decision1 ;)
19:18:25 <edwarnicke> LuisGomez: It might even be educational ;)
19:18:36 * tykeal doesn't know what would need to be done
19:18:57 <edwarnicke> tykeal: <joking>See... educational ;) </>
19:19:01 * edwarnicke is showing his sgml roots
19:19:04 <tykeal> bad time for the education ;)
19:19:18 <edwarnicke> tykeal: LOL... an argument to which I am utterly sympathetic :)
19:19:19 <LuisGomez> there is some wiki on how to do that, i can take a look
19:19:40 <edwarnicke> LuisGomez: Is 'there is some wiki on how to do that'... was that a question or a statement?
19:19:52 <LuisGomez> statement
19:19:55 <LuisGomez> i am sure
19:19:57 <edwarnicke> Ah :)
19:19:59 <edwarnicke> Cool :)
19:20:09 <edwarnicke> Do you know where that wiki on how to do it is?
19:20:16 <LuisGomez> i can find it yes
19:20:25 <edwarnicke> Cool... ping me if you have trouble finding it
19:20:28 <LuisGomez> karaf step by step or similar
19:21:03 <edwarnicke> LuisGomez: And on Decision2, do you want Decision2.Option1 (leave things broken) or Decision2.Option2 (comment out SNMP4SDN until they merge the fix) ?
19:22:15 <LuisGomez> i was investigating the restconf issue with a local karaf installation so i do not need integration running right away
19:22:39 <LuisGomez> but if people push patches in the weekend and we want to do some test...
19:23:01 <edwarnicke> LuisGomez: OK... I was going to be preparing some patches there today to integration to clean up some small things that can cause subtle but important problems
19:23:14 <edwarnicke> LuisGomez: I expect to see a bunch of folks scrambling this weekend
19:23:24 <LuisGomez> so then option 2 as well
19:23:40 <edwarnicke> OK... do you want to prepare the patch for Option2 there and I'll review it?
19:23:54 <LuisGomez> yes, that first so we can get going
19:24:01 <LuisGomez> i will do right away
19:24:03 <edwarnicke> Also... you should probably email snmp4sdn-dev and Christine letting her know
19:24:05 <edwarnicke> LuisGomez: Thank you :)
19:24:11 <edwarnicke> LuisGomez: Ping me when you need a review :)
19:24:12 <LuisGomez> ok
19:24:15 <LuisGomez> ok
19:24:28 <edwarnicke> tykeal: Does the overall root issue make sense to you?
19:24:37 <tykeal> edwarnicke: I believe so
19:24:46 <edwarnicke> tykeal: I ask, because you spread a whole lot of sane around generally... and so its helpful for you to understand ;)
19:24:50 <edwarnicke> tykeal: Cool :)
19:24:59 <tykeal> as I said, I was just worried that it was a brokenness in the migrated environment
19:25:02 <edwarnicke> LuisGomez: Just to let you know the subtle thing I'm poking at right now (actually, two things)
19:25:49 <LuisGomez> ok
19:25:50 <edwarnicke> 1)  We have some things we need that are not copied into system/  but as they are in the maven central repo this does not *break* us... but does produce very slow startup, and *would* break someone operating in an offline mode where they couldn't reach maven central... so I'm going to fix that
19:25:57 <tykeal> also, because that job is finishing UNSTABLE it means it hasn't triggered any of the jobs downstream of it so the rest of the environment interconnect hasn't been validated :-/
19:26:50 <edwarnicke> 2)  We had a lot of cases this week of folks seeing bugs that had to do with stale snapshots of stuff in their local .m2 cache, and karaf grabbing those.  This makes addressing bugs very hard at times.  So I was going to cause the integration karaf distro at least to *not* look at the local .m2
19:27:27 <LuisGomez> i believe 1) is related with the fact that current karaf distro cannot start stand-alone
19:27:27 <edwarnicke> tykeal: I am strongly in favor of us making sure the new environment is all working... because I am going to guess you won't feel at ease till you know it is (and I want you to feel at ease ;) )
19:27:44 <edwarnicke> LuisGomez: Wait... where are you seeing it not start standalone?
19:27:48 <tykeal> well, I have very high confidence but... yeah
19:27:59 <edwarnicke> LuisGomez: Because it *should* start standalone as long as there is network connectivity to maven central
19:27:59 * tykeal loves puppet managed systems
19:28:14 <LuisGomez> edwarnicke, remember the guava issue?
19:28:22 <edwarnicke> Yes, but didn't we fix that?
19:28:27 <LuisGomez> i think i saw it in latest distro
19:28:33 <LuisGomez> i can retry
19:28:38 * edwarnicke is curious to understand :)
19:29:10 <edwarnicke> LuisGomez: OK... either way... I am going to absolutely make sure *everything* is in there as long as folks don't bork their features/pom.xml files (which I can't work around easily)
19:29:44 <edwarnicke> LuisGomez: For Lithium I think I can get a patch to the karaf guys for their maven plugin that can construct the system/ directory by walking the features files... which should make this all much easier, and also make the zip much smaller
19:29:51 * tykeal manually triggers integration-master-csit-karaf-compatible-min
19:30:06 * edwarnicke knows how to solve more problems than he has time to do before Helium ;(
19:30:16 <LuisGomez> also for 2) i think CASP3R told me we are clearing m2 cache every time we deploy karaf distro in integration Jenkins
19:30:19 <LuisGomez> yes
19:30:42 <LuisGomez> otherwise there are issues
19:31:05 <tykeal> LuisGomez: we're clearing just the org/opendaylight portion of m2 cache which is the part we really need to
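Clearing only the org/opendaylight subtree, as described here, keeps third-party artifacts cached while dropping potentially stale ODL snapshots. A minimal sketch of that idea; this is not the script CASP3R wrote (it is not shown in the log), and the function name is an assumption:

```python
import shutil
import tempfile
from pathlib import Path


def clear_opendaylight_cache(m2_repo):
    """Delete <m2_repo>/org/opendaylight only; return True if it was removed."""
    target = Path(m2_repo) / "org" / "opendaylight"
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


# Demonstrate on a throwaway directory rather than a real ~/.m2/repository.
with tempfile.TemporaryDirectory() as repo:
    (Path(repo) / "org" / "opendaylight" / "controller").mkdir(parents=True)
    (Path(repo) / "org" / "apache" / "karaf").mkdir(parents=True)
    removed = clear_opendaylight_cache(repo)
    odl_left = (Path(repo) / "org" / "opendaylight").exists()
    others_left = (Path(repo) / "org" / "apache" / "karaf").exists()

print(removed, odl_left, others_left)
```

On a real machine the argument would typically be the local repository directory (by default `~/.m2/repository`), or whatever path the job uses.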
19:31:22 <LuisGomez> ok
19:31:27 <LuisGomez> that is then
19:31:46 <LuisGomez> ah you helped CASP3R  :)
19:32:28 <tykeal> LuisGomez: more like he did it and I had looked over his script and gave it a thumbs up ;)
19:32:31 <edwarnicke> LuisGomez: That is helpful for integration
19:32:41 <edwarnicke> LuisGomez: It doesn't help for the case where other folks are finding issues :)
19:33:38 <LuisGomez> ok, time to fix the snmp
19:35:31 <LuisGomez> thanks edwarnicke and tykeal
19:35:40 <edwarnicke> Thank you LuisGomez !
19:35:48 <edwarnicke> And tykeal , thank you for being here on a weekend to help out :)
19:36:28 <tykeal> hrmm... I'm somewhat concerned with this: https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-compatible-min/15/console
19:36:34 <tykeal> that's a lot of connection refused
19:38:00 <LuisGomez> the existing karaf distro might be broken
19:38:06 <tykeal> :-/
19:38:26 <LuisGomez> lets fix the integration removing the snmp and recheck
19:39:03 <tykeal> ok... well, I'll just let this job run to completion
19:39:26 <tykeal> or should I just cancel it and when you get the snmp bits removed and we'll just let everything flow?
19:39:56 <tykeal> oh score, something passed:
19:39:57 <tykeal> Karaf-All.MD SAL NSF OF13 :: Test suite for MD-SAL NSF mininet OF13   | PASS | 0 critical tests, 0 passed, 0 failed 23 tests total, 0 passed, 23 failed
19:40:02 <LuisGomez> you can stop it if you want
19:40:31 <tykeal> ok, that was really all the confirmation I really needed, it lets me know that things are operating appropriately enough to both pass and fail tests :)
19:41:25 <tykeal> oh wait, I misread that. It was actually a bunch of failed tests :-/
19:42:47 <tykeal> heh, all the PASS on the tests are from RESTCONF tests and looking over the counts it's because there aren't actually any tests... test count 0, easy to pass that...
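The "zero tests, easy to pass" trap can be guarded against by treating an empty suite as its own verdict rather than a pass. A small sketch that reads the totals from a summary line of the shape pasted above; the verdict names and the empty-suite sample line are assumptions:

```python
import re

# Matches the trailing "<n> tests total, <p> passed, <f> failed" counts.
TOTALS = re.compile(r"(\d+) tests total, (\d+) passed, (\d+) failed")


def suite_verdict(summary_line):
    """Classify a suite summary as PASS, FAIL, EMPTY (no tests ran) or UNKNOWN."""
    match = TOTALS.search(summary_line)
    if match is None:
        return "UNKNOWN"
    total, _passed, failed = (int(g) for g in match.groups())
    if total == 0:
        return "EMPTY"  # nothing ran, so a PASS label would be meaningless
    return "PASS" if failed == 0 else "FAIL"


# Counts taken from the pasted line, plus a constructed zero-test example.
print(suite_verdict("0 critical tests, 0 passed, 0 failed 23 tests total, 0 passed, 23 failed"))
print(suite_verdict("0 critical tests, 0 passed, 0 failed 0 tests total, 0 passed, 0 failed"))
```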
19:42:49 <LuisGomez> lets say so far i cannot blame the migration for all the issues we are having, i will let you know otherwise  :)
19:42:50 <edwarnicke> LuisGomez: What memory settings for Karaf are you using?
19:43:11 <edwarnicke> LuisGomez: I am going to set something reasonable globally
19:43:18 <LuisGomez> let me see, i changed that after talking to you
19:43:53 <tykeal> FYI all of your systems are 8cpu x 8G RAM systems now
19:43:53 <edwarnicke> LuisGomez: Because its not cool that I get permgen errors when trying compatible-with-all in RC0 just because of so many bundles being loaded... need to fix that
19:43:54 <LuisGomez> export JAVA_OPTS="-Xmx2048m -XX:MaxPermSize=512m"
19:44:00 <edwarnicke> tykeal: :) :) :)
19:44:08 <edwarnicke> LuisGomez: Cool :)
19:44:14 <tykeal> well, all the integration systems ;)
19:44:52 <LuisGomez> so we are privileged  :)
19:45:13 <tykeal> something like that
19:45:27 <LuisGomez> i will not tell anybody…
19:45:30 <tykeal> heh
19:46:27 <tykeal> when we dynamically launch builders they get 8x8 systems as well, but not everyone is doing stuff like that right now. So we've got some masters on 2x2 and 4x4 depending upon what their actual observed usage was like before they migrated
19:46:56 <LuisGomez> makes sense
19:48:57 <tykeal> ok, so, for right now I'm going to assume that the env is good. give me a ping if you want me to check something though
19:49:13 <LuisGomez> right
19:49:25 <LuisGomez> will do
19:49:40 <LuisGomez> unsolicited ping?
19:49:54 <tykeal> LuisGomez: that will be fine ;)
19:50:07 * edwarnicke casts around for his ping clothes
19:52:52 <odp-gerritbot> Luis Gomez proposed a change to integration: Removing SNMP feature dues to integration issues. Will be back when resolved.  https://git.opendaylight.org/gerrit/11145
19:53:23 <LuisGomez> ok lets see if the patch passes the tests
19:53:32 <edwarnicke> LuisGomez: :)
19:55:39 <LuisGomez> oh, i removed the feature repo instead of the feature itself, need to file second patch
19:55:50 <edwarnicke> LuisGomez: OK :)
19:57:53 <odp-gerritbot> Luis Gomez proposed a change to integration: Removing SNMP feature due to integration issues. Will be back when resolved.  https://git.opendaylight.org/gerrit/11145
21:09:35 <edwarnicke> tykeal: Still around?
21:09:41 <tykeal> edwarnicke: yes
21:12:42 <tykeal> edwarnicke: what can I do for you?
21:12:59 <edwarnicke> tykeal: So... I am in the process of making a change that will fix two issues
21:13:19 <edwarnicke> 1) It will preclude things like the snmp4sdn bug we just hit in integration (by forcing the breakage to the snmp4sdn verify job)
21:13:44 <edwarnicke> 2)  It will make it easier to avoid non-heisenbugs where folks are getting stale artifacts at runtime from their .m2 cache
21:13:47 <edwarnicke> But there's a cost...
21:13:57 <tykeal> oh?
21:13:59 <edwarnicke> It means all the local karaf distros will be big like integration's
21:14:04 <edwarnicke> (not *as* big... but still big)
21:14:21 <edwarnicke> So it seemed to me I should *probably* at least mention that to you first ;)
21:14:31 <tykeal> umm... yeah, thanks for the info
21:14:41 <edwarnicke> It somehow felt unfriendly to *surprise* you with a sudden shift in disk usage
21:14:52 * edwarnicke tries to practice the principle of least surprise...
21:15:11 <tykeal> as an FYI in the last 2 weeks the nexus usage went from ~75G of data (for snapshots, releases and all proxied artifacts) to nearly 200G...
21:15:33 * tykeal had to add more disk to the system yesterday because of it
22:36:53 <LuisGomez> edwarnicke
22:37:08 <LuisGomez> you can merge integration patch https://git.opendaylight.org/gerrit/#/c/11145/
22:38:23 <LuisGomez> it takes 1.5 hours to build integration now
22:39:37 <odp-gerritbot> A change was merged to integration: Removing SNMP feature due to integration issues. Will be back when resolved.  https://git.opendaylight.org/gerrit/11145
22:39:44 <edwarnicke> Merged
22:39:51 <edwarnicke> Do we know *why* its now taking 1.5 hours?
22:39:57 <edwarnicke> What was it taking before the migration to rackspace?
22:40:13 <edwarnicke> tykeal: Question... do we know what nature of disk IO we have?
22:40:23 <edwarnicke> tykeal: Was wondering if that might be slowing things down
22:40:38 <LuisGomez> just before rackspace it was taking long as well but i never figured out how long because jobs were timing out
22:41:44 <LuisGomez> [INFO] Reactor Summary:
22:41:44 <LuisGomez> [INFO]
22:41:46 <LuisGomez> [INFO] OpenDaylight Integration Project .................. SUCCESS [1.945s]
22:41:47 <LuisGomez> [INFO] OpenDaylight Distributions ........................ SUCCESS [0.347s]
22:41:49 <LuisGomez> [INFO] OpenDaylight Base Edition ......................... SUCCESS [1:43.872s]
22:41:50 <LuisGomez> [INFO] Opendaylight Virtualization Edition ............... SUCCESS [23.291s]
22:41:52 <LuisGomez> [INFO] OpenDaylight Service Provider Edition ............. SUCCESS [34.176s]
22:41:53 <LuisGomez> [INFO] OpenDaylight Toaster Edition ...................... SUCCESS [12.878s]
22:41:54 <LuisGomez> [INFO] features-integration .............................. SUCCESS [1:27:11.681s]
22:41:55 <LuisGomez> [INFO] distribution-karaf ................................ SUCCESS [1:02.077s]
22:42:10 <LuisGomez> feature test takes all the time i guess
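The guess that the feature test takes nearly all the time can be confirmed by converting the reactor timings to seconds. A short sketch that parses the summary lines pasted above; it only handles `SUCCESS` rows, and the regular expression is an assumption about this particular output layout:

```python
import re

REACTOR = """\
[INFO] OpenDaylight Integration Project .................. SUCCESS [1.945s]
[INFO] OpenDaylight Distributions ........................ SUCCESS [0.347s]
[INFO] OpenDaylight Base Edition ......................... SUCCESS [1:43.872s]
[INFO] Opendaylight Virtualization Edition ............... SUCCESS [23.291s]
[INFO] OpenDaylight Service Provider Edition ............. SUCCESS [34.176s]
[INFO] OpenDaylight Toaster Edition ...................... SUCCESS [12.878s]
[INFO] features-integration .............................. SUCCESS [1:27:11.681s]
[INFO] distribution-karaf ................................ SUCCESS [1:02.077s]
"""

LINE = re.compile(r"\[INFO\] (?P<name>.+?) \.+ SUCCESS \[(?P<time>[\d:.]+)s\]")


def to_seconds(stamp):
    """Convert 'ss.mmm', 'm:ss.mmm' or 'h:mm:ss.mmm' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


durations = {}
for row in REACTOR.splitlines():
    match = LINE.search(row)
    if match:
        durations[match.group("name")] = to_seconds(match.group("time"))

slowest = max(durations, key=durations.get)
share = durations[slowest] / sum(durations.values())
print(f"{slowest}: {durations[slowest]:.0f}s, {share:.0%} of the build")
```

For this run the slowest module is `features-integration`, which accounts for well over nine tenths of the total reactor time.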
23:08:46 <tykeal> edwarnicke: it was taking ~45 minutes before rackspace. It was taking ~10 minutes in rackspace before removing the snmp bits
23:09:01 <tykeal> as for the I/O the disks at rackspace are faster than what we have in LF
23:09:37 <tykeal> I don't have hard numbers for comparison, but I do know that we're getting better I/O out of them. In most cases we're on SSD
23:09:58 <tykeal> whereas at LF we were on SAS or SATA in worst case
23:14:55 <tykeal> FYI I'm seeing the following error in the karaf.log on the controller system:
23:14:56 <tykeal> 2014-09-13 23:04:37,605 | WARN  | Event Dispatcher | AetherBasedResolver              | 5 - org.ops4j.pax.url.mvn - 1.6.0 | Error resolving artifactorg.opendaylight.integration:features-integration:xml:features:
23:14:56 <tykeal> 0.2.0-SNAPSHOT:Could not find artifact org.opendaylight.integration:features-integration:xml:features:0.2.0-SNAPSHOT in nexus-release-mirror (http://nexus01.dfw.opendaylight.org:8081/nexus/content/groups/public/
23:14:56 <tykeal> )
23:14:56 <tykeal> org.sonatype.aether.resolution.ArtifactResolutionException: Could not find artifact org.opendaylight.integration:features-integration:xml:features:0.2.0-SNAPSHOT in nexus-release-mirror (http://nexus01.dfw.opend
23:14:57 <tykeal> aylight.org:8081/nexus/content/groups/public/)
23:15:10 <tykeal> looks like it's trying to find something where it doesn't belong as well...
23:18:31 <LuisGomez> yes
23:18:36 <LuisGomez> i see this error too
23:18:50 <LuisGomez> thats why tests do not pass now
23:19:08 <tykeal> I'm going to hazard a guess it's why the controller doesn't actually start listening on the ports it's supposed to
23:19:16 <LuisGomez> sure
23:19:28 <LuisGomez> the controller does not start with this exception
23:20:04 <LuisGomez> the karaf log is way longer than what we are getting now
23:20:34 <LuisGomez> btw tykeal
23:20:54 <LuisGomez> it is possible that with karaf we only need 1 controller deploy job
23:21:32 <LuisGomez> only difference between jobs now is features to deploy + sleep time
23:21:33 * tykeal isn't the one that replicated the job everywhere ;)
23:22:11 <LuisGomez> before with old distro was more difficult but now we have a single distro
23:23:30 <LuisGomez> i will check how feasible it is when i get some time
23:53:46 <LuisGomez> tykeal, this is weird, i have no issues deploying the karaf edition on my laptop
23:53:59 <LuisGomez> i do not get the above error
23:54:37 <tykeal> it's not failing?
23:54:44 <LuisGomez> i will retry cleaning everything
23:54:59 <LuisGomez> .m2 and .karaf
23:55:09 <LuisGomez> just in case i missed that
23:57:17 <tykeal> LuisGomez: can you try with the following ~/.m2/settings.xml : http://pastebin.com/uZPV4p3r
23:57:38 <tykeal> that will mirror what we're doing in Rackspace, just using the master nexus since you can't reach the nexus proxy we're using
23:58:21 <tykeal> correction to it: http://pastebin.com/gkGePUgZ  (I missed a couple of characters in my original paste)
23:58:22 <LuisGomez> ok
23:58:27 <LuisGomez> yes
00:06:03 <LuisGomez> correct it fails with these settings
00:06:31 <LuisGomez> tykeal, are these settings correct?
00:07:26 <tykeal> LuisGomez: a modified version that uses the private nexus repo is what every project in Rackspace has been using for over a month now
00:07:42 <LuisGomez> ok
00:08:40 <tykeal> what this file does is forces a hard repository separation between snapshot artifacts and release artifacts. If something is somehow misconfigured / misidentified as a release it causes maven to look in the release repo... which is what we're seeing
00:09:24 <tykeal> I could remove the configuration from the integration lab if you want, but things misidentifing themselves is a bug
00:18:41 <LuisGomez> no leave it like that then until edwarnicke takes a look
00:25:43 <edwarnicke> Reading log
00:26:41 <edwarnicke> tykeal: That's not really an error from nexus repo issues
00:27:01 <edwarnicke> tykeal: Its the final error that occurs because that artifact was not installed in the local .m2
00:27:14 <edwarnicke> (or in our case ${WORKSPACE}/.m2repo )
00:27:36 <edwarnicke> tykeal: Which is to say, its not *really* looking for it there
00:28:28 <edwarnicke> tykeal: LuisGomez Is this on *launching* the controller, or in line for *building* it?
00:28:44 <tykeal> edwarnicke: it's on the system that is launching the controller
00:29:03 <edwarnicke> tykeal: Cool... does anyone have a link to the controller we are trying to run there?
00:29:06 <tykeal> the deploy job grabs the tarball from the build job, extracts it and then runs it
00:29:15 <edwarnicke> Got a link to that tarball or zip?
00:29:27 <LuisGomez> i have link, hold
00:29:53 <edwarnicke> I'll download, look at it, and live blog^H^H^H^H chat what I look at and the process I use to look into it
00:30:12 <LuisGomez> wget https://jenkins.opendaylight.org/integration/view/Verify%20Jobs/job/integration-master-verify-distributions/lastSuccessfulBuild/artifact/distributions/extra/karaf/target/distribution-karaf-0.2.0-SNAPSHOT.zip
00:30:14 <tykeal> edwarnicke: the deploy process is scripted here:
00:30:31 <tykeal> and that ^ would be the current deploy package ;)
00:30:45 <tykeal> here's the job in question: https://jenkins.opendaylight.org/integration/view/Deploy%20Jobs/job/integration-master-deploy-controller-latest-karaf-compatible-all/configure
00:30:54 <tykeal> all the deploy jobs basically do this shell
00:31:01 <edwarnicke> Downloading
00:31:32 <edwarnicke> Glanced at the script
00:31:47 <edwarnicke> But this is going to be more about the zipfile
00:31:52 <edwarnicke> Because if your script were borked
00:32:01 <edwarnicke> It wouldn't get as far as that error :)
00:32:01 <tykeal> also edwarnicke if it isn't actually trying to do the download during the startup, why does it fail for LuisGomez using our modified settings.xml but not without it?
00:32:07 <edwarnicke> tykeal: OK
00:32:16 <edwarnicke> So for the controller run, this is a bit different
00:32:34 <edwarnicke> tykeal: There is a prepackaged mvn repo in system/
00:32:41 <edwarnicke> tykeal: It should have everything we need
00:33:15 <tykeal> ok, then the settings.xml shouldn't matter... that is of course, if it isn't trying to download something
00:33:16 <edwarnicke> tykeal: karaf only looks in your local .m2, in well known repos  like central, and if you have something in settings.xml, it will look there
00:33:26 <edwarnicke> tykeal: apparently it is not respecting snapshots there ;)
00:33:31 <tykeal> ah
00:33:37 <edwarnicke> tykeal: So if things are *correct*
00:33:46 <edwarnicke> tykeal: It should have everything it needs in system/
00:33:51 <edwarnicke> So let me look at that
00:33:54 <tykeal> ok
00:33:56 <tykeal> thanks :)
00:34:09 <edwarnicke> tykeal: So you are not crazy.. it's just weird
00:34:25 <edwarnicke> tykeal: And this is, once again, revealing (probably) something really broken, not an infra issue
00:34:36 <LuisGomez> yep
00:34:41 <tykeal> ahh... you took that away from me? I was hoping that the tinfoil hat would look good ;)
00:35:30 <tykeal> I dare say, this move to rackspace for integration sure has uncovered some weird issues
00:35:33 * edwarnicke ponders where to find a really fashionable tinfoil hat
00:35:43 * edwarnicke thinks he knows the right artist
00:36:15 <tykeal> http://media-cache-ak0.pinimg.com/736x/e3/ba/86/e3ba8639b16e292922bf6df43c9de28c.jpg
00:36:20 <tykeal> nice tinfoil fedora ;)
00:36:27 <LuisGomez> all that is happening today in integration is probably nothing to do with the move, i am sorry for tykeal  :)
00:36:33 <edwarnicke> tykeal: Exactly :)
00:36:41 <edwarnicke> LuisGomez: We'll see :)
00:36:51 <edwarnicke> OK
00:36:53 <edwarnicke> I downloaded
00:36:54 <edwarnicke> unzipped
00:37:03 <edwarnicke> did a quick find through system/ for features-integration
00:37:09 <edwarnicke> saw something that looked roughly right
00:37:16 <edwarnicke> rm -rf ~/.m2
00:37:18 <LuisGomez> ok
00:37:19 <edwarnicke> now running
00:37:21 <edwarnicke> cd bin/
00:37:22 <edwarnicke> ./karaf
00:37:30 <edwarnicke> Runs locally
00:37:32 <LuisGomez> and…
00:37:39 <edwarnicke> Now checking for things in ~/.m2/
00:37:40 <LuisGomez> hold a bit
00:37:51 <LuisGomez> see if you get the int feature installed
00:38:03 <LuisGomez> karaf starts but feature is not found
00:38:13 <edwarnicke> *oh*
00:38:13 <LuisGomez> with tykeal settings
00:38:16 <edwarnicke> Which feature should I install?
00:38:36 <LuisGomez> odl-integration-compatible-with-all
00:39:03 <edwarnicke> Installing
00:40:00 <LuisGomez> it also works locally for me but not when i use tykeal .m2/settings.xml
00:40:17 <edwarnicke> tykeal: Can you give me a laundered .m2/settings.xml to use?
00:40:25 <tykeal> http://pastebin.com/gkGePUgZ
00:40:39 <tykeal> we use a modified version of that for all systems in rackspace
00:40:53 <tykeal> modified to point to the internal proxy that is ;)
00:41:03 <tykeal> and also have the deployment user info...
00:41:30 <edwarnicke> tykeal: OK... could you also pastebin me an example of the settings.xml you used to use in LF for integration?
00:41:37 <edwarnicke> tykeal: Just so I can eyeball compare
00:43:03 <edwarnicke> Starting again with empty ~/.m2/repository, ~/.m2/settings.xml, and a fresh unpack of the zip file
00:43:11 <tykeal> edwarnicke: here http://pastebin.com/XfpvaQYJ
00:43:37 <tykeal> as I said, the only real difference is that we have a) user push info and b) we point to the local nexus proxy
00:44:32 <edwarnicke> Is this the one for the Rackspace: http://pastebin.com/gkGePUgZ ?
00:44:43 <LuisGomez> tykeal, repo issues as well in central job:
00:44:45 <LuisGomez> https://jenkins.opendaylight.org/integration/view/Integration%20jobs/job/integration-master-project-centralized-integration/2388/console
00:44:47 <tykeal> edwarnicke: the second one that I pasted is
00:45:14 <edwarnicke> Ah.. and the first one is from LF ?
00:45:29 <tykeal> for general consumption: http://pastebin.com/gkGePUgZ
00:45:29 <tykeal> redacted one for rackspace: http://pastebin.com/XfpvaQYJ
00:45:53 <tykeal> edwarnicke: we didn't do that when running inside LF, I started doing this for systems inside rackspace to force them to use the local proxy
00:45:55 <edwarnicke> LuisGomez: That error looks like network connectivity in infra
00:46:08 <edwarnicke> tykeal: ACK
00:46:40 <tykeal> doing this in rackspace saves ~20 minutes of artifact recovery times in a lot of cases
00:46:47 <edwarnicke> tykeal: LuisGomez I have reproduced the error with the LF settings.xml file
00:47:00 <LuisGomez> ok
00:47:28 <tykeal> edwarnicke: if you think that the settings.xml file is improperly formed I would like to know so I can fix it. This is in use on every build system in rackspace
00:47:44 <edwarnicke> tykeal: I have no conclusions yet
00:47:47 <tykeal> and has been since we started there over a month ago
00:48:08 <tykeal> correction 2.5 months ago ;)
00:48:34 <edwarnicke> tykeal: My current thinking is that it's unlikely to be that the file is malformed... rather that something weird is happening, need to figure it out
00:48:52 <tykeal> ok
00:50:27 <edwarnicke> tykeal: Question... where is the settings.xml file on the controller running server at LF, and where is it at rackspace (Filesystem location)
00:50:51 <LuisGomez> edwarnicke is right, http://nexus01.dfw.opendaylight.org:8081/ is not reachable
00:51:01 <tykeal> ~/.m2/settings.xml in all cases
00:51:08 <edwarnicke> tykeal: Good to know
00:51:33 <tykeal> the file is owned by root but readable by jenkins. that way bright light doesn't try futzing with it via a jenkins job
00:51:39 * tykeal actually had a dev do that
00:51:57 <tykeal> before I set the root perms that is ;)
00:52:03 <edwarnicke> tykeal: *sigh*
00:52:05 <edwarnicke> silly dev
00:52:14 <tykeal> it was GiovanniMeo ;)
00:52:27 <tykeal> when the odlautorelease was first getting setup
00:52:45 <edwarnicke> tykeal: If I want you to change settings.xml, I will ask you (so you can tell me why my idea is questionable ;) )
00:52:52 <tykeal> hehe
00:52:57 <edwarnicke> tykeal: I noticed he just brought his own at the end of the day
00:53:37 <tykeal> hrmm... he had asked us to make changes to the one on odlautorelease, I would expect that would be all he needed...
00:53:42 * edwarnicke notes his autorelease does not require a space station
00:55:00 <edwarnicke> tykeal: OK... here's what I've tried...
00:55:12 <edwarnicke> I used the LF settings.xml to begin with because I was confused:
00:55:20 <edwarnicke> https://www.irccloud.com/pastebin/om0r3fwc
00:55:27 <edwarnicke> I pastebinned it above so we are all on the same page
00:55:43 <edwarnicke> *that* settings.xml produces the failure
00:55:57 <tykeal> right, that's the one I handed you
00:58:14 <LuisGomez> both central and karaf deploy errors seem related as both complain they cannot reach http://nexus01.dfw.opendaylight.org:8081/
00:58:50 <edwarnicke> tykeal: So that's from LF, the *pre* migration settings.xml, correct?  The one that was *working* before?
00:59:00 <tykeal> well, nobody not inside the rackspace network can reach nexus01.dfw.opendaylight.org
00:59:25 <tykeal> edwarnicke: we _didn't_ use something like that pre-migration all the settings.xml had in it was user info
00:59:42 <edwarnicke> tykeal: *oh*
00:59:54 <tykeal> we've _been_ using this: http://pastebin.com/XfpvaQYJ for all systems in rackspace (~2.5 months)
00:59:58 <edwarnicke> So premigration we didn't have any pointer to repos?
01:00:21 <tykeal> edwarnicke: that's correct, because when the envs were all getting setup I didn't know enough about maven to do this sort of thing
01:00:32 <edwarnicke> tykeal: Could you pastebin me what was used in LF yesterday that was working (with appropriate REDACTIONS of credentials) ?
01:00:55 <tykeal> edwarnicke: remove the mirrors section of that and you'll have it exactly
01:01:11 <edwarnicke> tykeal: Ah... OK
01:01:56 <edwarnicke> tykeal: So... working theory (not tested) is that by having a settings.xml, karaf is trying there instead of the local system/
01:02:04 <edwarnicke> tykeal: Let me poke at that for a moment
01:02:08 <tykeal> ok
01:02:24 <tykeal> as I've said, I _can_  remove the mirrors section from the integration lab if we need to
01:02:38 <tykeal> it's not needed, I'm just trying to save bandwidth and download time
01:04:21 <edwarnicke> tykeal: In a pinch, could you just remove it from the ones running the controller?
01:04:43 <tykeal> yes
01:05:22 <LuisGomez> yes, controller vm jobs
01:06:14 <LuisGomez> edwarnicke, karaf pax-exam is also impacted by this?
01:06:31 <edwarnicke> LuisGomez: No
01:06:31 <LuisGomez> or only when deploying karaf distro
01:06:33 <LuisGomez> ok
01:06:37 <LuisGomez> good to know
01:07:14 <edwarnicke> tykeal: OK... so it looks like with a settings.xml, it's deciding to pick that instead of the configured repos, one of which is the system/ directory
01:07:23 <edwarnicke> Now I just need to figure out why and how to fix it :0
01:07:46 <tykeal> ok, just let me know if I should pull it from the vms running the controller for the tests
01:08:19 <LuisGomez> i would say do it if it is not too much work tykeal
01:08:34 <LuisGomez> because now we are stopped at integration
01:08:54 <LuisGomez> we cannot deploy controller  :(
01:10:10 <LuisGomez> unless edwarnicke thinks he can fix this very soon…
01:10:25 <edwarnicke> LuisGomez: I would concur
01:11:38 <tykeal> done
01:11:59 <LuisGomez> ok, let's try
01:15:09 <LuisGomez> yep, it is working now  :)
01:15:28 <tykeal> :-/ silly karaf
01:15:49 <edwarnicke> tykeal: I actually have a bit better understanding of things as well
01:15:49 <tykeal> ok, that one _is_ an infra issue, but only because we don't know why karaf is being silly ;)
01:15:55 <edwarnicke> tykeal: So... you have a mirror config
01:16:19 <edwarnicke> And your mirror config says to try anything that is not a mirror of opendaylight.snapshots on that release mirror (overall, this is a *very* good thing)
01:16:35 <edwarnicke> karaf looks in common repos like central for things
01:16:37 <tykeal> it's the recommendation in the nexus administration guides
01:16:50 <edwarnicke> So your mirror config tells it to look for things in your release mirror it would otherwise look for in central
01:16:59 <edwarnicke> tykeal: Do not mistake me, what you are doing there is pure awesome
01:17:08 <edwarnicke> tykeal: I can't begin to tell you how much I approve
01:17:17 <edwarnicke> tykeal: Like wish we were doing that *everywhere* LF or not
01:17:28 <edwarnicke> (within ODL infra that is)
01:17:32 <tykeal> hehe
01:17:42 <edwarnicke> tykeal: So this is in no way saying you should have done something different
01:17:56 <edwarnicke> tykeal: Just explaining the mystery of WTF is it looking in the release repo for a snapshot
01:18:00 <tykeal> well, I can add in the mirror clause for controller and ovsdb to point to the master nexus (as it's the one close to them). They're the only holdouts in the LF infra
01:18:13 <edwarnicke> tykeal: Yeah... that's not the root issue
01:18:14 <tykeal> edwarnicke: your controller dynamic VMs, since they are in rackspace are using the mirror
01:18:22 <edwarnicke> The root issue is that for reasons I'm still investigating
01:18:35 <edwarnicke> having a settings.xml is causing it to ignore the rest of the karaf config about such things
01:18:43 <edwarnicke> Which is a problem I have to solve anyway
01:18:52 <tykeal> ok
01:18:55 <edwarnicke> Which is to say, you found a bug that I'm happy we found instead of a customer
01:18:59 <edwarnicke> Good Job! :)
01:19:18 <tykeal> it would probably be something to get fixed since you never know what freaky admin decides to stick a settings.xml in their system... ;)
01:19:34 <edwarnicke> freaky admins ;)
01:19:48 <edwarnicke> I agree.. it's one I'm happy we found first
01:20:07 <tykeal> ok, can you raise the appropriate bug about it then? I'm not exactly certain what should go in it ;)
01:20:26 <edwarnicke> tykeal: Yes, but I need to investigate it a bit more
01:20:30 <tykeal> ok
01:20:32 <edwarnicke> tykeal: It feels like a karaf bug
01:20:36 <tykeal> ah
01:20:39 <edwarnicke> tykeal: Which we may need to workaround
01:20:49 <edwarnicke> tykeal: the good news is, I know those guys
01:20:53 <edwarnicke> And can ask them :)
01:20:55 <tykeal> excellent
01:21:12 <edwarnicke> (they have fixed bugs for me before, and more often than that, they have pointed out the switch I flipped wrong ;) )
01:21:22 <tykeal> in the mean time, I think I need to go track down some dinner. I'm a bit famished
01:21:24 <edwarnicke> tykeal: But the good news is:
01:21:31 <edwarnicke> a)  I have a reasonable hypothesis
01:21:36 <edwarnicke> b)  We have a reasonable workaround
01:21:52 <edwarnicke> tykeal: Were you going to remove the settings.xml from the controller VMs?
01:22:27 <tykeal> I commented out the mirrors section and already made that change. I left the user info in the file as we've had that since time began (so to speak)
01:22:54 <edwarnicke> tykeal: Cool
01:23:12 <edwarnicke> LuisGomez: Can you give it a whirl while tykeal ensures we still have a corporeal freakishly awesome sysadmin?
01:23:48 <LuisGomez> yes, deploy works, i just need to pass the test now
01:28:40 <edwarnicke> LuisGomez: tykeal after reading a bit more, I think I now understand
01:28:48 <edwarnicke> Let me explain
01:28:50 <tykeal> ok
01:29:10 <edwarnicke> in etc/org.ops4j.pax.url.mvn.cfg
01:29:18 <edwarnicke> karaf defines a list of repos:
01:29:26 <edwarnicke> One of which is:
01:29:34 <edwarnicke> file:${karaf.home}/${karaf.default.repository}@id=system.repository
01:29:37 <edwarnicke> which is the system repo
01:29:43 <edwarnicke> system/
01:29:49 <LuisGomez> ok
01:29:50 <edwarnicke> The one we pre-populate
01:30:03 <LuisGomez> and should contain all
01:30:06 <edwarnicke> Yep
01:30:17 <edwarnicke> But these are being treated as simple *repos*, like any other maven repo
01:30:27 <edwarnicke> In comes tykeal 's settings.xml
01:30:41 <edwarnicke> And it says 'Whatever repo you have, if it's not opendaylight.snapshots, use this mirror instead'
01:30:47 <edwarnicke> Since system/ is just another repo
01:30:57 <tykeal> *sigh*
01:30:59 <edwarnicke> karaf obediently asks the mirror instead
01:31:12 <edwarnicke> tykeal: LuisGomez Does that make sense?
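The behaviour edwarnicke describes can be modelled with a short sketch of Maven-style `<mirrorOf>` matching. This is a simplified illustration, not Maven's or pax-url's actual code: it handles only exact ids, the `*` wildcard, and `!id` exclusions (the `external:` forms are left out), and the pattern `*,!opendaylight.snapshots` is an assumption inferred from the description above, not copied from the pasted settings.xml.

```python
def mirror_applies(mirror_of: str, repo_id: str) -> bool:
    """Return True if a mirror with this <mirrorOf> value replaces repo_id.

    Simplified model: comma-separated patterns, where '*' matches any
    repository, 'id' matches that repository, and '!id' excludes it.
    """
    matched = False
    for pattern in (p.strip() for p in mirror_of.split(",")):
        if pattern.startswith("!"):
            if pattern[1:] == repo_id:
                return False  # an explicit exclusion always wins
        elif pattern == "*" or pattern == repo_id:
            matched = True
    return matched


# Assumed pattern: mirror everything except the snapshots repo.
MIRROR_OF = "*,!opendaylight.snapshots"

for repo in ("central", "system.repository", "opendaylight.snapshots"):
    print(repo, mirror_applies(MIRROR_OF, repo))
```

Under this model `system.repository` is matched by the wildcard just like `central`, so lookups that should have been served from the local system/ directory are redirected to the release mirror.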
01:31:18 <LuisGomez> sure
01:31:36 <tykeal> oh look, the controller actually bound to the ports it was supposed to
01:31:40 <edwarnicke> Now I can hack this
01:32:00 <edwarnicke> Because I *can* put a settings.xml into the karaf distro, and tell the config file to use that as the settings.xml
01:32:10 <LuisGomez> edwarnicke, is there a way to tell karaf ONLY use /system
01:32:12 <edwarnicke> I'm not positive that's the right thing to do yet, need to think about it
01:32:28 <edwarnicke> LuisGomez: Yes
01:32:33 <edwarnicke> LuisGomez: I am pretty sure I can do that
01:32:38 <LuisGomez> cool
01:32:41 <edwarnicke> LuisGomez: I am just not sure I *should* do that
01:32:56 <LuisGomez> now will break everything probably...
01:33:02 <LuisGomez> but we need that for release
01:33:06 <tykeal> :-/ the job still seems to be failing tests though
01:33:16 <LuisGomez> let me check
01:33:19 <edwarnicke> LuisGomez: I don't think we do
01:33:23 <edwarnicke> LuisGomez: Let me tell you why
01:33:35 <tykeal> still saying no route to host in the tests which seems strange to me
01:33:37 <edwarnicke> So karaf looks at system/ but also common public repos like central
01:33:56 <tykeal> you're a) using IPs b) I've verified those IPs are accessible by all the test lab systems
01:34:00 <edwarnicke> LuisGomez: Users have a very reasonable expectation, that if they do a bundle:install for a bundle that lives in central, it will work
01:34:08 <edwarnicke> If we *force* only using system/ it won't
01:34:42 <edwarnicke> LuisGomez: So I'm going to do this
01:34:55 <edwarnicke> LuisGomez: Let me check the settings.xml override
01:35:00 <edwarnicke> LuisGomez: To make sure it *works*
01:35:07 <edwarnicke> And then we can decide whether we *should* do that
01:35:29 <edwarnicke> At our leisure
01:36:08 <LuisGomez> tykeal: controller at 192.168.4.5 is not reachable for some reason
01:36:56 <LuisGomez> edwarnicke:ok
01:37:03 <tykeal> LuisGomez: hrmm...
01:37:25 <LuisGomez> tykeal: is not reachable from both robot and mininet vms
01:37:38 <tykeal> looking into it now, it should be
01:38:02 <tykeal> hah, I see the issue, one moment
01:40:00 <edwarnicke> LuisGomez: I can force use of a settings.xml inside our zip file if I need to
01:40:04 <edwarnicke> Just verified
01:40:17 <edwarnicke> Again, not sure if I should, would have to think about it
01:40:54 <LuisGomez> ok, let's think about it before doing it
01:40:57 <LuisGomez> no rush
01:41:14 <edwarnicke> Yeah... as long as we know it *does* work (as opposed to *thinking* it *may* work ;) )
01:41:26 <edwarnicke> And it makes sense now :0
01:41:28 <edwarnicke> :)
01:43:21 <edwarnicke> LuisGomez: I did discover investigating earlier today that there are about 50 jars that we probably want in system/ that are not there (stuff from outside of ODL).  I am working on making sure they are there, which should speed up startup and feature install a lot
01:43:49 <LuisGomez> sure
01:44:33 <LuisGomez> the important thing is that we should not require internet to install a karaf distro with odl features
01:45:01 <LuisGomez> totally stand-alone distro for the features we define
01:45:24 <LuisGomez> tykeal: let me know when i should retry my test
01:46:28 <edwarnicke> LuisGomez: *yes*
01:46:37 <tykeal> hrmm... strange, I could have sworn I had fully tested the inter connect on the private network
01:46:41 <edwarnicke> LuisGomez: Which is what I am working on fixing :0
01:47:05 <tykeal> even so, it's still failing on me with some very wide open firewall rules
01:47:28 * tykeal considers just routing the traffic over the front side network as that _does_ work
01:48:24 <tykeal> then again, that doesn't seem to be wanting to work right now either
01:48:29 <LuisGomez> tykeal: all the vms have IP in same subnet so no need to route right?
01:48:55 <tykeal> LuisGomez: correct, no routing should actually be involved, just allowing traffic to pass the firewall which it is setup for already
01:49:08 <LuisGomez> are all interfaces up and connected to same bridge?
01:49:20 <LuisGomez> can you do arp -a in the vms?
01:49:44 <tykeal> LuisGomez: they're connected to the same network. I can't say they're on the same bridge. I don't know what the rackspace network looks like ;)
01:49:59 <LuisGomez> :)
01:50:20 <tykeal> I think I may actually need to give them a reboot. Something seems to be a bit off. I just had some weird connectivity issue to the VMs
01:50:31 <LuisGomez> ok, go ahead
01:53:26 <LuisGomez> tykeal: in rackspace you do not manage the vswitches for the VMs? how do you know then a VM interface is well connected to the network?
01:54:17 <LuisGomez> maybe you have a GUI like in openstack but that is not the best way to troubleshoot an issue  :)
01:54:21 <tykeal> LuisGomez: it's all OpenStack. we define a subnet and say bring a VM up connected to these various networks and it allocates an IP address on the subnet and connects a NIC to the subnet
01:54:54 <LuisGomez> :)
01:55:06 <tykeal> and as of last night I _could_ ssh between the test lab systems across the front side nics. But right now, even after the reboots I'm getting no route to hose
01:55:09 <tykeal> err.. host
01:55:09 <LuisGomez> so openstack is ready for production  :)
01:55:57 <LuisGomez> well at least they support you in rackspace with these kinds of issues right?
01:56:21 <LuisGomez> without access to the host, very difficult to troubleshoot i guess
01:56:28 <tykeal> yeah, I've also got a couple of the VMs I just rebooted not even coming up yet
01:56:38 <tykeal> which is weird
01:58:22 <tykeal> I also find it rather interesting that Jenkins seems to think that the systems may be up...
01:59:01 <tykeal> even though I can't get to them
02:00:57 <LuisGomez> yep
02:01:11 <LuisGomez> ok, i have to leave for a while now
02:01:24 <tykeal> yeah, something is seriously messed up here and I'm going to have to contact Rackspace about it
02:01:26 <LuisGomez> i'll let you work on it  :)
02:02:15 <LuisGomez> also do not know if related but the issue with  http://nexus01.dfw.opendaylight.org:8081/
02:02:29 <LuisGomez> central job is blocked by this too
02:02:39 <tykeal> ?
02:03:14 <LuisGomez> https://jenkins.opendaylight.org/integration/view/Integration%20jobs/job/integration-master-project-centralized-integration/
02:03:50 <LuisGomez> this job is failing because it cannot reach above repo
02:04:06 <LuisGomez> this is in the master jenkins vm
02:04:17 <tykeal> yeah, it's related
02:04:35 <LuisGomez> good
02:04:45 <LuisGomez> at least it is the same thing  :)
02:05:38 <LuisGomez> need to go now, i will check later today or tomorrow morning
02:06:09 <tykeal> ok, I'm thinking that Rackspace is having a major network problem. a lot of our VMs are starting to have network issues all of a sudden
02:06:44 <LuisGomez> can be yes
02:07:39 <tykeal> I think I'm going to go find some dinner and come back to it. Hopefully it will have shaken itself out because if many of our systems are having an issue then it will likely be something they are already working on
02:07:49 <edwarnicke> tykeal: EAT! :)
02:08:06 * tykeal waves be back in a while
02:08:16 <edwarnicke> tykeal: Have fun storming the castle!
02:08:26 <LuisGomez> yes, have fun
02:08:59 <tykeal> oh, hey, looks like things may have just corrected themselves
02:09:06 <edwarnicke> Hooray!
02:09:49 <tykeal> and yes, I can communicate between systems on both front and back side networks, so things _should_ be working now
02:10:15 <tykeal> I'll come check again after dinner
02:11:58 <LuisGomez> ok, triggering jobs now
02:25:47 <edwarnicke> Good news :)
02:25:57 <edwarnicke> Autorelease with sdni and plugin2oc is working :)
12:46:01 <odp-gerritbot> David Goldberg proposed a change to integration: added sfclisp feature to integration  https://git.opendaylight.org/gerrit/11153
12:46:59 <odp-gerritbot> David Goldberg proposed a change to integration: added sfclisp feature to integration  https://git.opendaylight.org/gerrit/11154
19:00:49 <odp-gerritbot> David Goldberg proposed a change to integration: added sfclisp feature to integration  https://git.opendaylight.org/gerrit/11166
20:46:58 <odp-gerritbot> A change was merged to integration: added sfclisp feature to integration  https://git.opendaylight.org/gerrit/11166
20:51:06 <odp-gerritbot> David Goldberg proposed a change to integration: added sfcofl2 to integration  https://git.opendaylight.org/gerrit/11170
05:53:34 <CASP3R> #endmeeting