#acumos-meeting: Architecture Committee

Meeting started by farheen_att at 14:03:19 UTC (full logs).

Meeting summary

  1. ML Workbench - Sayee and team (farheen_att, 14:07:35)
    1. Sayee reviewed the ML Workbench, which is built with Angular. The advantage is that it is easy to build a UI with a plugin approach, and each module can run in a pod. Problem: today we deploy to a single VM, and enterprises have their own sets of pipelines. Proposed solution: a config manager that lets an admin configure their Acumos instance. (farheen_att, 14:14:01)
    2. Metadata for model mapping is stored in CouchDB. Similar content will be mapped in config management. (farheen_att, 14:15:50)
    3. Where does this manager live? Is it an addendum to the existing Admin or a standalone config manager? (farheen_att, 14:17:10)
    4. Needs to be discussed. Support and guidelines for K8s are driven by what we co-host on the platform. It takes a long time to launch a pipeline or notebook. An advantage of NoSQL is that we are integrating with CouchDB. (farheen_att, 14:21:19)
    5. Where will the config manager be: in the admin UI or as a tile in Design Studio? (farheen_att, 14:23:05)
    6. Model association: the config manager will be able to configure how the model is shared across two instances of Acumos through E5. (farheen_att, 14:25:31)
    7. Linking is done through NiFi. The association is not stored in NiFi. We have to train data scientists to use it, so there is a learning curve. (farheen_att, 14:29:16)
    8. The data set is a data set of metadata, such as attributes. (farheen_att, 14:29:53)
    9. Where is the history? (farheen_att, 14:30:02)
    10. Where is the record of the data association to the model? (farheen_att, 14:30:35)
    11. NiFi gives you data as well as provenance. (farheen_att, 14:30:49)
    12. Does NiFi give you provenance out of the box? (farheen_att, 14:31:05)
    13. ACTION: Sayee will talk through the gaps in the details around data sets and provenance. (farheen_att, 14:31:57)
    14. Priya - You have a data set ID and its physical location. As a part of onboarding I can specify the model and a sample data set, then onboard them as a package. (farheen_att, 14:32:59)
    15. Priya proposes adding the sample data set at the time of onboarding. (farheen_att, 14:33:51)
    16. This represents the ground zero of provenance. Modelers are making a claim about what they are onboarding. There is no verification system. Keep it in mind. (farheen_att, 14:35:06)
    17. Sayee likes Priya's suggestion of adding the data set at the time of onboarding. (farheen_att, 14:36:08)
    18. We will continue to have a UI for create/modify/delete. When we onboard a model we can associate the data source through the initial UI. The association is optional at the time of onboarding and mandatory at the time of publishing. (farheen_att, 14:37:30)
    19. Priya - Profile of 10,000 records. (farheen_att, 14:38:19)
    20. Sayee - You can write an SQL query. (farheen_att, 14:39:11)
    21. Tausif - Create/modify/delete. How will we migrate the old data to the new? (farheen_att, 14:39:54)
    22. Sayee - A dataset is always associated with the model. (farheen_att, 14:40:45)
    23. It is a many-to-many relationship. Once an association is made, can you change it later? (farheen_att, 14:41:37)
    24. Yes, they have to be able to go and edit. (farheen_att, 14:41:47)
    25. What if I change the datasets related to one model? (farheen_att, 14:42:04)
    26. I onboarded a model with dataset1. What if I want to change that association on day 2 because I made a mistake? Then I should have the ability to change it. (farheen_att, 14:43:06)
    27. Second case: I have the same model with a new dataset. Then I have to onboard a new model. (farheen_att, 14:43:33)
    28. Essentially it's a definition of a dataset. Go to the UI where you define your dataset. You have a name, a description, and a data source definition: a URI to where you can access the data. Completing this task yields name 1. What if you change the URI? (farheen_att, 14:46:34)
    29. Call it name 2. (farheen_att, 14:46:45)
    30. What if you change the source of the model? (farheen_att, 14:47:06)
    31. Then you cannot change the URI. You have to create a new model. (farheen_att, 14:47:32)
    32. In summary for this topic: we are working on associations and the config manager. (farheen_att, 14:48:31)
    33. Bryan can provide information on the cluster and successful deployment. Anything beyond that you need to specify. (farheen_att, 14:49:33)
    34. Managing the lifecycle is something that needs to be improved. (farheen_att, 14:50:16)
    35. More issues: do we need to dig deeper on CouchDB? (farheen_att, 14:50:58)
    36. Issue with K8s? Is it still an issue? (farheen_att, 14:51:38)
    37. We have a Helm chart that installs CouchDB, and it is a part of system integration just like MariaDB. (farheen_att, 14:52:07)
    38. Polymer? (farheen_att, 14:52:20)
    39. Problem: it takes a long time to load components due to dependencies. Not scalable. (farheen_att, 14:53:53)
    40. We need to bundle the dependencies into one file. The initial load is a little slow, but the response time after that is much faster. (farheen_att, 14:55:11)
    41. It's the number of files, not the size of the files. We are benefiting from web components. The final product can be a web component to drop into view components. (farheen_att, 14:57:22)
    42. It's a reliability concern. The files received can stack a number of threats. Packaging as a single file is more efficient. License compliant. (farheen_att, 14:58:22)
    43. The back end is not affected. (farheen_att, 14:58:54)
    44. Any issue seen from the UI team? (farheen_att, 14:59:05)
    45. Yes, an issue with our existing CSS. (farheen_att, 14:59:38)
    46. We will try to match the color scheme and set up a call with you all. (farheen_att, 15:00:09)
    47. ACTION: Sayee set up a call with Tausif, Farheen, and Vasu. (farheen_att, 15:01:06)
    48. Any other performance improvements? (farheen_att, 15:01:50)
    49. It lazy loads the pipeline and ML Workbench. (farheen_att, 15:02:24)
    50. Are minification tools used for the Polymer single file? (farheen_att, 15:02:53)
    51. The tools are not integrated into the Polymer process. Our focus is on the feature. (farheen_att, 15:03:29)
    52. Any caching strategy? (farheen_att, 15:03:35)
    53. We get caching automatically on load. On the server side, the strategy is gzip compression when we serve the files. (farheen_att, 15:04:11)
    54. Brotli is a better compression tool. (farheen_att, 15:05:53)
    55. Brotli is a part of server-side compression. You can see it in your request header view. (farheen_att, 15:06:26)
    56. Caching can be optimized for the file name. (farheen_att, 15:07:06)
    57. ACTION: Manoop add the topics for ML workbench for the next call. (farheen_att, 15:08:14)
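
The dataset association rules discussed in items 22 to 31 can be sketched roughly as follows. This is a minimal illustration, not the Acumos implementation: the class names, method names, model IDs, and file URIs are all hypothetical, and the in-memory store stands in for whatever backing store is chosen. It assumes one reading of the discussion: a dataset definition (name, description, URI) is fixed once created, a changed URI gets a new name, and model-to-dataset associations are many-to-many and editable.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetDefinition:
    """Hypothetical dataset definition: a name, a description, and a data source URI."""
    name: str
    description: str
    uri: str


@dataclass
class AssociationRegistry:
    """Hypothetical in-memory registry of dataset definitions and model links."""
    datasets: dict = field(default_factory=dict)  # name -> DatasetDefinition
    links: set = field(default_factory=set)       # (model_id, dataset_name) pairs

    def define_dataset(self, name, description, uri):
        # A changed URI must be registered under a new name ("call it name 2").
        existing = self.datasets.get(name)
        if existing is not None and existing.uri != uri:
            raise ValueError(f"{name!r} already points at {existing.uri}; use a new name")
        self.datasets[name] = DatasetDefinition(name, description, uri)
        return self.datasets[name]

    def associate(self, model_id, dataset_name):
        # Many-to-many: any model may link to any defined dataset.
        if dataset_name not in self.datasets:
            raise KeyError(dataset_name)
        self.links.add((model_id, dataset_name))

    def reassociate(self, model_id, old_name, new_name):
        # Correct a mistaken association after onboarding.
        if new_name not in self.datasets:
            raise KeyError(new_name)
        self.links.discard((model_id, old_name))
        self.links.add((model_id, new_name))

    def datasets_for(self, model_id):
        return sorted(d for m, d in self.links if m == model_id)

    def models_for(self, dataset_name):
        return sorted(m for m, d in self.links if d == dataset_name)


# Example usage with made-up model IDs and URIs.
registry = AssociationRegistry()
registry.define_dataset("name 1", "sample data", "file:///data/v1.csv")
registry.define_dataset("name 2", "sample data, new location", "file:///data/v2.csv")
registry.associate("model-a", "name 1")
registry.associate("model-b", "name 1")
registry.reassociate("model-a", "name 1", "name 2")
```

The sketch does not cover the case in item 31, where a change to the model source requires onboarding a new model, and it keeps no history of past associations, which is the provenance gap raised in items 9 and 10.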


Meeting ended at 15:08:18 UTC (full logs).

Action items

  1. Sayee will talk through the gaps in the details around data sets and provenance.
  2. Sayee set up a call with Tausif, Farheen, and Vasu.
  3. Manoop add the topics for ML workbench for the next call.


People present (lines said)

  1. farheen_att (61)
  2. collabot` (3)


Generated by MeetBot 0.1.4.