#acumos-meeting: Architecture Committee
Meeting started by farheen_att at 14:03:19 UTC
Meeting summary
- ML Workbench - Sayee and team (farheen_att, 14:07:35)
- Sayee reviewing ML Workbench. Using Angular.
Advantage: it is easy to build a UI with a plugin approach. Each module
can run in a pod. Problem: today we deploy to a single VM.
Enterprises have their own set of pipelines. Solution is to add a
config manager that lets an admin configure their Acumos
instance. (farheen_att,
14:14:01)
- Storing metadata for model mapping in CouchDB.
Similar content will be mapped in config management. (farheen_att,
14:15:50)
- Where do we put this manager? Is it an
addendum to the existing Admin or a standalone config manager?
(farheen_att,
14:17:10)
- Needs to be discussed. Support and guidelines
for K8s are driven by what we co-host on the platform. It takes a
long time to launch a pipeline or notebook. Advantage with NoSQL is
we are integrating with CouchDB. (farheen_att,
14:21:19)
- Where will the config manager be? In the admin
UI or as a tile in Design Studio? (farheen_att,
14:23:05)
- model association - config manager will be able
to configure how the model is shared across two instances of Acumos
through E5. (farheen_att,
14:25:31)
- Linking is done through NiFi. The association
is not stored in NiFi. We have to train the data scientist to use it.
There is a learning curve. (farheen_att,
14:29:16)
- The data set is a data set of metadata, such as
attributes. (farheen_att,
14:29:53)
- where is the history? (farheen_att,
14:30:02)
- Where is the record of the data association to
the model? (farheen_att,
14:30:35)
- NiFi gives you data as well as
provenance. (farheen_att,
14:30:49)
- Does NiFi give you provenance out of the
box? (farheen_att,
14:31:05)
- ACTION: Sayee will
talk through the gaps in the details around data sets and
provenance. (farheen_att,
14:31:57)
- Priya - you have a data set ID and its
physical location. As a part of onboarding I can specify the model
and sample data set, then onboard it as a package. (farheen_att,
14:32:59)
- Priya proposes to add the sample data set
during the time of on-boarding. (farheen_att,
14:33:51)
- This represents the ground zero of provenance.
Modelers are making a claim about what they are onboarding. There is no
verification system. Keep it in mind. (farheen_att,
14:35:06)
- Sayee likes Priya's suggestion of adding the
data set during on-boarding. (farheen_att,
14:36:08)
- We will continue to have a UI for create, modify,
delete. When we onboard a model we can associate the data source
through the initial UI. Optional at the time of onboarding and
mandatory at the time of publishing. (farheen_att,
14:37:30)
- Priya - Profile of 10,000 records.
(farheen_att,
14:38:19)
- Sayee - You can write an SQL query.
(farheen_att,
14:39:11)
- Tausif - Create/modify/delete. How will we
migrate the old data to the new? (farheen_att,
14:39:54)
- Sayee - a dataset is always associated with the
model. (farheen_att,
14:40:45)
- It is a many-to-many relationship. Once an
association is made, can you change it later? (farheen_att,
14:41:37)
- yes, they have to be able to go and
edit. (farheen_att,
14:41:47)
- What if I change the datasets related to one
model? (farheen_att,
14:42:04)
- I onboarded a model with dataset1. I want to
change that association on day 2 because I made a mistake. Then I
should have the ability to change it. (farheen_att,
14:43:06)
- Second case: I have the same model with a new
dataset; then I have to onboard a new model. (farheen_att,
14:43:33)
- Essentially it's a definition of a dataset. Go
to the UI where you define your dataset. You have a name, description,
and data source definition: a URI to where you can get the data.
Completion of this task is name 1. What if you change the URI? (farheen_att,
14:46:34)
- call it name 2 (farheen_att,
14:46:45)
- what if you change the source of the
model? (farheen_att,
14:47:06)
- Then you cannot change the URI. You have to
create a new model. (farheen_att,
14:47:32)
- In summary of this topic, we are working on
associations and the config manager. (farheen_att,
14:48:31)
- Bryan can provide information on the cluster
and successful deployment. Anything beyond that you need to
specify. (farheen_att,
14:49:33)
- managing lifecycle is something that needs to
be improved. (farheen_att,
14:50:16)
- more issues. Do we need to dig deeper on
couchdb? (farheen_att,
14:50:58)
- Issue with K8s? Is it still an issue?
(farheen_att,
14:51:38)
- We have a Helm chart that installs CouchDB, and it is
a part of system integration just like MariaDB. (farheen_att,
14:52:07)
- Polymer? (farheen_att,
14:52:20)
- Problem: it takes a long time to load components
due to dependencies. Not scalable. (farheen_att,
14:53:53)
- We need to bundle the dependencies into one
file. The initial load is a little slow, but after that the response
time is much faster. (farheen_att,
14:55:11)
- it's the number of files not the size of the
files. We are benefitting from the web component. Final product
can be a web component to drop in view components. (farheen_att,
14:57:22)
- It's a reliability concern. The files received
can stack a number of threats. Packaging as a single file is more
efficient. License compliant. (farheen_att,
14:58:22)
- Back end is not affected. (farheen_att,
14:58:54)
- issue seen from ui team? (farheen_att,
14:59:05)
- Yes, issue with our existing CSS (farheen_att,
14:59:38)
- we will try to match the color scheme and set
up a call with you all. (farheen_att,
15:00:09)
- ACTION: Sayee set up
a call with Tausif, Farheen, and Vasu. (farheen_att,
15:01:06)
- any other performance improvements?
(farheen_att,
15:01:50)
- lazy loads the pipeline and ML Workbench
(farheen_att,
15:02:24)
- Are minification tools used for the Polymer single
file? (farheen_att,
15:02:53)
- the tools are not integrated into the polymer
process. Our focus is on the feature. (farheen_att,
15:03:29)
- any caching strategy? (farheen_att,
15:03:35)
- We get caching automatically on load. The server-side
strategy is gzip when we serve the
files. (farheen_att,
15:04:11)
- Brotli is a better compression tool (farheen_att,
15:05:53)
- Brotli is a part of server-side compression.
You can see it in your request header view. (farheen_att,
15:06:26)
- caching can be optimized for the file
name. (farheen_att,
15:07:06)
- ACTION: Manoop add
the topics for ML workbench for the next call. (farheen_att,
15:08:14)
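
A minimal sketch of the dataset association rules discussed above (14:40:45 to 14:47:32), assuming a simple in-memory store. `Catalog`, `DatasetDef`, and every method name here are hypothetical illustrations, not Acumos APIs. It shows a many-to-many link between models and datasets, an association that can be edited later, a changed URI requiring a new dataset name, and the association being optional at onboarding but mandatory at publishing.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetDef:
    """A dataset definition: name, description, and a URI to the data."""
    name: str
    description: str
    uri: str


@dataclass
class Catalog:
    """Hypothetical in-memory store of dataset definitions and associations."""
    datasets: dict = field(default_factory=dict)  # name -> DatasetDef
    links: set = field(default_factory=set)       # (model_id, dataset_name)

    def define_dataset(self, name, description, uri):
        # Assumption from the notes: a changed URI gets a new name ("name 2").
        existing = self.datasets.get(name)
        if existing is not None and existing.uri != uri:
            raise ValueError(f"{name!r} already points at {existing.uri}; use a new name")
        self.datasets[name] = DatasetDef(name, description, uri)
        return self.datasets[name]

    def associate(self, model_id, dataset_name):
        # Many-to-many: any model may link to any defined dataset.
        if dataset_name not in self.datasets:
            raise KeyError(dataset_name)
        self.links.add((model_id, dataset_name))

    def dissociate(self, model_id, dataset_name):
        # Lets a modeler correct a mistaken association later.
        self.links.discard((model_id, dataset_name))

    def datasets_for(self, model_id):
        return sorted(d for m, d in self.links if m == model_id)

    def models_for(self, dataset_name):
        return sorted(m for m, d in self.links if d == dataset_name)

    def can_publish(self, model_id):
        # Optional at onboarding, mandatory at publishing.
        return bool(self.datasets_for(model_id))


if __name__ == "__main__":
    catalog = Catalog()
    catalog.define_dataset("name1", "sample data", "file:///data/a.csv")
    catalog.associate("model-A", "name1")
    print(catalog.datasets_for("model-A"), catalog.can_publish("model-A"))
```

Keeping the links in a separate set rather than on the model record is one way to keep the history question (where the record of the association lives) independent of both the model and NiFi; a real implementation would persist it, for example in CouchDB.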
Meeting ended at 15:08:18 UTC
Action items
- Sayee will talk through the gaps in the details around data sets and provenance.
- Sayee set up a call with Tausif, Farheen, and Vasu.
- Manoop add the topics for ML workbench for the next call.
People present (lines said)
- farheen_att (61)
- collabot` (3)
Generated by MeetBot 0.1.4.