#acumos-meeting: Architecture Committee
Meeting started by farheen_att at 14:05:55 UTC
Meeting summary
- Agenda (farheen_att, 14:07:23)
- ML Workbench, DNS compliance issue, Kafka
pivot (farheen_att,
14:12:30)
- microservice generation impact and the overall design (farheen_att, 14:13:05)
- Guy sharing his bullet points (farheen_att,
14:13:25)
- we also want to merge Acumos with the AT&T
internal org, which uses Jenkins. (farheen_att,
14:15:22)
- https://wiki.acumos.org/pages/viewpage.action?pageId=26640755
(farheen_att,
14:16:01)
- This is planned for the Demeter release, not Clio.
There will be code refactoring of ms generation. (farheen_att,
14:17:48)
- which other components will be impacted?
(farheen_att,
14:17:58)
- onboarding and portal have to be modified to
call the API in jenkins but everything else should be the
same. (farheen_att,
14:18:30)
- Tausif - The affected module needs to call the
Jenkins API. At what point in time does it need to be called?
(farheen_att,
14:19:10)
- Guy - Right now in both onboarding and portal
there are times when you call the ms generation api. Instead of
calling the ms-generation api you call jenkins api. Minor
modifications. You have to know where the jenkins api is and the
ms-generation api will be turned off. (farheen_att,
14:20:17)
- outbound calls will continue to be handled by
the runnable jars and will look the same. (farheen_att,
14:20:53)
- solution id, revision id, and the user id will
be the parameters to deploy or scan a model. (farheen_att,
14:21:23)
- Tausif - Will this be at the time of onboarding
a model? (farheen_att,
14:21:35)
- Guy - Yes, if you create an ms. The portal will
call the ms-generation api which will be changed to jenkins.
(farheen_att,
14:22:12)
- Tausif - Generation of the ms is coupled with
onboarding. Are we taking it away from onboarding and using
Jenkins? (farheen_att,
14:22:36)
- Guy - It's its own module and can be invoked
from on-boarding. You don't have to involve the portal at all. If
you do use the portal to do web onboarding then you will have to
call jenkins. (farheen_att,
14:23:19)
- Tausif - How will we get the information
back? (farheen_att,
14:23:41)
- Exactly the same as today. There is no impact
on that part. (farheen_att,
14:23:57)
- Will we get a performance gain? (farheen_att,
14:24:13)
- No, the speed of ms generation is bounded by
the docker host for fetching (farheen_att,
14:24:43)
- why are we doing this? (farheen_att,
14:24:50)
- docker and docker requires root level access
and it doesn't work in Azure. (farheen_att,
14:25:20)
- docker in docker not docker and docker
(farheen_att,
14:25:44)
- scaling is an additional benefit. (farheen_att,
14:26:44)
- Is there failure handling? Jenkins stops
working after some time for no reason. We need to re-trigger the
job. (farheen_att,
14:27:14)
- Guy - Logging for the build process is
available. We have not addressed it, but it is something to consider. We
create the logs and see what goes wrong. (farheen_att,
14:28:12)
- proper error handling should be put forward
with the proper flow. At the time of user stories we can add the
right flows. (farheen_att,
14:28:40)
- Priya - Can the docker image size be optimized
by Jenkins? (farheen_att,
14:30:20)
- we can brainstorm and see at the time of build.
What is the target environment and can it be built in that
environment? (farheen_att,
14:31:05)
- it's not the size of the kernel that's killing
us. The model does have. (farheen_att,
14:33:20)
- Sayee - Concerned about docker size
(farheen_att,
14:34:40)
- Building a faster, smaller docker image is an
orthogonal issue. (farheen_att,
14:35:00)
- irrespective of docker or not. Inside the ms
generation should we optimize or not? We can work it in
parallel. (farheen_att,
14:35:42)
- ML Workbench modeler user experience (farheen_att, 14:36:32)
- Guy - first thing is if we can't keep the
Acumos internal views in sync with Jupyter notebooks then integrating
with Acumos won't add value. Changes to a notebook will be
lost. (farheen_att,
14:38:11)
- If I noticed these notebooks were not in sync
I would ignore the models. So I would not find it valuable to have
Acumos keep track of these. (farheen_att,
14:39:04)
- I would ignore the models in Acumos.
(farheen_att,
14:39:40)
- Bryan - Use git as a back end and
synchronize. (farheen_att,
14:40:07)
- Sayee agrees that there should be tighter
integration. (farheen_att,
14:40:56)
- There is no reason that I would use the
Notebook. Modeler credentials being pre-loaded would be helpful. I
want to just do a push. It knows who I am. (farheen_att,
14:41:45)
- I would also like to see more client libraries.
CMLP has shell access which is good. You can't do that with the ML
libs in jupyter. (farheen_att,
14:42:25)
- Bryan - URIs and credentials are easily
available. (farheen_att,
14:42:46)
- Guy- this is an easy fix. (farheen_att,
14:42:59)
- Bryan - It is fixed. It is easy to do.
(farheen_att,
14:43:13)
- Sayee - As we evolve, the advantage is providing
GPUs as part of the platform so a GPU can be attached on an as-needed
basis. (farheen_att,
14:43:47)
- This is the system that I tested IST. Sharing
of the code is there. (farheen_att,
14:44:24)
- Guy - we need persistence of user so that when
notebooks crash you don't lose everything. (farheen_att,
14:44:45)
- Guy - Bryan uses persistent volumes and git;
these are good solutions. (farheen_att,
14:45:13)
- Guy resource concern. We are pulling and
running all the containers. (farheen_att,
14:45:57)
- Sharing is great when you look at RCloud to
share code. It would be nice to have a community of coders.
(farheen_att,
14:46:31)
- Bryan - It's easy to add packages inside the
notebook using the Python command. (farheen_att,
14:47:21)
- A shell window would be nice. (farheen_att,
14:47:30)
- Jupyter does the same thing. (farheen_att,
14:47:38)
- Bryan - If we want to build our own customized
jupyter stacked images we can do so. (farheen_att,
14:48:07)
- add this to the etherpad. (farheen_att,
14:50:01)
- ACTION: Manoop - add
this link to the etherpad. (farheen_att,
14:51:00)
- https://wiki.acumos.org/display/AR/Thoughts+on+ML+Workbench+from+a+Modeler%27s+Perspective
(farheen_att,
14:51:36)
- Bring your ideas to face to face meeting. (farheen_att, 14:52:41)
- https://etherpad.acumos.org/p/DemeterPlanningWorkshop
(farheen_att,
14:53:36)
- High level overview of Acumos-2901 (farheen_att, 14:54:14)
- Parag - When you try to deploy the model
in a k8s environment it will fail because it doesn't accept
certain characters. (farheen_att,
14:55:54)
- it's the name of the container in the pod.
Every pod is a domain name inside the cluster. The name of the
model may not be DNS compliant. (farheen_att,
14:56:38)
- we should not restrict the user to creating a
name in DNS-acceptable format, but change the model name instead.
(farheen_att,
14:58:12)
- when the user is onboarding we should check the
model metadata such as name. (farheen_att,
14:59:03)
- rather than restricting the user we should
automatically generate the name. (farheen_att,
14:59:31)
- we need a friendly name and a DNS-compliant name.
A short term solution is to restrict the user until it has been
fixed. (farheen_att,
15:00:49)
- ACTION: Guy lead the
effort to accept a DNS generated name. (farheen_att,
15:04:11)
- Guy - there is code in there already to do
that. (farheen_att,
15:05:19)
- Priya - have a friendly name and a system
name. (farheen_att,
15:05:54)
- everyone agrees (farheen_att,
15:06:03)
- ACTION: Parag convert
Acumos-2901 from an Issue into a User Story in Demeter in
jira. (farheen_att,
15:07:08)
- ML Workbench status (farheen_att, 15:08:42)
- SV and deployment will not have a final docker
by the end of this week. At the earliest, the end of next week. (farheen_att,
15:12:06)
- there may be a gap in how to deploy.
(farheen_att,
15:13:05)
- we will figure it out in the integration
cycle. (farheen_att,
15:13:18)
- if development is ready then testing can start.
If not, then an issue should be raised. (farheen_att,
15:13:38)
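The Jenkins hand-off discussed under microservice generation can be sketched roughly as follows: instead of calling the ms-generation API, onboarding or the portal would post to a parameterized Jenkins job, passing the solution id, revision id, and user id. The `buildWithParameters` endpoint is part of the standard Jenkins remote API; the base URL, the job name `ms-generation`, and the parameter names are assumptions for illustration and are not taken from the Acumos code. Authentication (API token, CSRF crumb) and the failure handling raised in the meeting are left out.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_jenkins_request(base_url, job_name, solution_id, revision_id, user_id):
    """Build (but do not send) a POST that triggers a parameterized Jenkins job."""
    # Parameter names are hypothetical; a real job defines its own.
    params = urlencode({
        "solutionId": solution_id,
        "revisionId": revision_id,
        "userId": user_id,
    })
    url = f"{base_url.rstrip('/')}/job/{quote(job_name, safe='')}/buildWithParameters"
    return Request(url, data=params.encode("utf-8"), method="POST")


# Hypothetical Jenkins host and job name, for illustration only.
req = build_jenkins_request(
    "https://jenkins.example.org", "ms-generation", "sol-1", "rev-1", "user-1"
)
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) is deliberately omitted, since the call site and credentials depend on where the Jenkins API ends up being hosted.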
Meeting ended at 15:14:07 UTC
Action items
- Manoop - add this link to the etherpad.
- Guy lead the effort to accept a DNS generated name.
- Parag convert Acumos-2901 from an Issue into a User Story in Demeter in jira.
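For the DNS-name action item, one possible way to keep a friendly name while deriving a system name is sketched below. It assumes the system name must be a valid DNS label in the RFC 1123 sense (lowercase letters, digits, and hyphens; starting and ending with an alphanumeric character; at most 63 characters), which is the rule Kubernetes applies to many object names. The function name `to_dns_label` and the fallback value are hypothetical, and this is not the code Guy mentioned as already existing.

```python
import re

MAX_LABEL_LEN = 63  # RFC 1123 DNS label length limit


def to_dns_label(friendly_name, fallback="model"):
    """Derive a DNS-compliant system name from a user-supplied friendly name."""
    label = friendly_name.lower()
    # Collapse every run of characters outside [a-z0-9] into a single hyphen.
    label = re.sub(r"[^a-z0-9]+", "-", label)
    # Enforce the length limit, then drop leading/trailing hyphens.
    label = label[:MAX_LABEL_LEN].strip("-")
    # If nothing usable remains, use the (hypothetical) fallback name.
    return label or fallback


print(to_dns_label("My Model_v2.0!"))  # my-model-v2-0
```

Different friendly names can map to the same label (for example `My Model` and `my_model`), so a real implementation would likely need a uniqueness step, such as appending part of the solution or revision id; that is left open here.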
People present (lines said)
- farheen_att (81)
- collabot` (3)
Generated by MeetBot 0.1.4.