=======================================
#acumos-meeting: Architecture Committee
=======================================


Meeting started by farheen at 14:15:35 UTC. The full logs are available
at
http://ircbot.wl.linuxfoundation.org/meetings/acumos-meeting/2018/acumos-meeting.2018-10-25-14.15.log.html
.



Meeting summary
---------------

* E2E Project Requirements  (farheen, 14:15:51)
  * Anwar: We have to have an understanding of Kubeflow; that will
    ensure that we have the maximum re-usability.  (farheen, 14:16:33)
  * Manoop: Data pipeline does have a high-level pipeline; from there
    the developer would look at the already existing components that we
    need to integrate.  (farheen, 14:17:17)
  * First start of tech document is finding re-usable existing
    components.  (farheen, 14:17:53)
  * Pantellis: I need help from each of the projects impacted; CMLP is
    one that you mentioned. That would be helpful to me.  (farheen,
    14:18:30)
  * Anwar: Yes, we'll have a one-day lockdown.  (farheen, 14:19:04)
  * Adi: I don't understand all the pipelines. It is a different
    framework. The pipeline effort is to take any flows, topics,
    publishers, to create a scoring pipeline and any type of pipeline
    that they want.  (farheen, 14:19:58)
  * Pantellis: It is semantics.  (farheen, 14:20:13)
  * Adi: If that is the case then we can break it up into 4, but to me
    the underlying framework is what is allowing me to create these
    different types of pipelines.  (farheen, 14:20:46)
  * Pantellis: OK, so you're saying the CMLP project already has a
    pipeline implemented?  (farheen, 14:21:16)
  * Adi: We have Kafka and Flink, but it is not ready. My goal was to
    have the pipelines completely API driven.  (farheen, 14:22:07)
  * Adi: Even Airflow provides a capability. Now they give you an API.
    (farheen, 14:22:34)
  * Pantellis: Compared to ARGO, do you know how it is compared to
    Kubernetes?  (farheen, 14:23:02)
  * Adi: I am not an expert on Airflow.  (farheen, 14:23:12)
  * Pantellis: Perhaps Adi and I can get together.
    (farheen, 14:23:42)
  * Adi: Yes.  (farheen, 14:23:49)
  * ACTION: Pantellis and Adi get together to further discuss.
    (farheen, 14:24:23)
  * Anwar: Pantellis, set up a lockdown with onboarding, Design Studio,
    and CMLP.  (farheen, 14:24:44)
  * Pantellis: We will kick it off in a workshop setting.  (farheen,
    14:25:05)
  * ACTION: Pantellis set up a lockdown.  (farheen, 14:25:15)
  * Jessica: Please anticipate that our part of Federated learning is to
    be in. Be sure that the E2E pipeline can accommodate it.  (farheen,
    14:25:54)
  * Jessica: We are coming late and trying to catch up.  (farheen,
    14:26:08)
  * Pantellis: Yes, I will include you in the federated learning.
    (farheen, 14:26:31)
  * Can you please include us?  (farheen, 14:26:57)
  * Pantellis: What's a lockdown?  (farheen, 14:27:07)
  * Anwar: It's an uninterrupted meeting that lasts a whole or half day.
    (farheen, 14:27:26)
  * Manoop: Before this release starts we plan for a 3-day workshop and
    ask each PTL to propose what will be in the technical architecture.
    (farheen, 14:28:17)
  * ACTION: Farheen Bring it up on the TSC call on Monday about the
    epics and exactly what will be the benefits.  (farheen, 14:31:46)

* AI/ML Target State Solution View  (farheen, 14:33:09)
  * Adi: Before getting into ML workbench.  (farheen, 14:33:57)
  * When you deal with ML you have to go through standard patterns.
    (farheen, 14:34:18)
  * You have systems of record and cooked data sets.  (farheen,
    14:34:47)
  * They are going to give you data in raw or cooked form and put it in
    some sort of catalog.  (farheen, 14:35:05)
  * My main point is it's a lot of work.  (farheen, 14:35:17)
  * If you're doing ML in this ecosystem, managing these 12 tasks is a
    real problem.  (farheen, 14:35:35)
  * Decisions are being taken automatically and models are entering and
    leaving the ecosystem. It is an organizational idea.  (farheen,
    14:36:29)
  * You get this out of the box.  (farheen, 14:36:37)
  * When I look at CMLP or Acumos I don't see the organizational
    construct.
    (farheen, 14:36:54)
  * It changes the way you collaborate.  (farheen, 14:37:04)
  * Anwar: What we saw in Kubeflow was management of training the data
    sets.  (farheen, 14:37:50)
  * Adi: Kubeflow is one way to do a lot of these things.  (farheen,
    14:38:02)
  * Data is one of those things you have to be careful about. How do you
    manage and organize access to it?  (farheen, 14:38:28)
  * Pantellis: Given that Kubeflow is an extension on top of Kubernetes,
    we can manage it through Kubernetes.  (farheen, 14:39:00)
  * Adi: I'm looking at it from a distributed platform. This management
    is not organized.  (farheen, 14:39:48)
  * Good questions. I'm setting the context for the workbench because it
    will constantly evolve. They need to be managed in a workbench.
    (farheen, 14:40:28)
  * Kazi: Can we go through 1 - 12?  (farheen, 14:40:43)
  * Adi: Multiple data lakes, databases, flat files: any large ML system
    has to deal with many types of data.  (farheen, 14:41:18)
  * Adi: 3. Data libraries: warehouses where you cook the data and
    create schemas, and then 4. Catalog.  (farheen, 14:41:51)
  * 5. I'm going to give it to somebody through a pipeline. 6. ML picks
    up data. 7. Training, then puts it in the catalog.  (farheen,
    14:42:29)
  * 8. You deploy into Kubernetes, and where do you get your data from?
    (farheen, 14:42:52)
  * 9. Run time. 11. When you build and run your model it is consuming
    your application is number 11.  (farheen, 14:44:22)
  * Adi: This is just a context.  (farheen, 14:44:46)
  * Adi: If you go to Google, AWS, or IBM clouds, the first thing you do
    is create a workbench. An organizing construct.  (farheen, 14:45:41)
  * Bryan: I think that Center data pipeline your input would be
    helpful. Wherever you see CMLP is where I see CMLP and Acumos.
    (farheen, 14:46:50)
  * ACTION: Adi: Update slide, remove CMLP and replace with AI Acumos.
    (farheen, 14:47:14)
  * Adi demonstrating CMLP workbench. 1. You create a project. You can
    give them an option for storage requirements.
    (farheen, 14:48:12)
  * You can have assets and do your ML in Notebooks, etc. Each entry is
    going to have an entry over here. A project is made of a set of
    assets. If I start competing with models they have to be associated
    with projects that have to work in run time.  (farheen, 14:49:38)
  * It's an organizational construct for assets.  (farheen, 14:49:58)
  * You can create a notebook and go and start adding assets to that.
    (farheen, 14:50:17)
  * Do you give equal rights for data constructs?  (farheen, 14:50:34)
  * Yes, you can use Flink and Kafka; you can start your ETL flows.
    (farheen, 14:51:00)
  * Anand: So creator can give access rights?  (farheen, 14:51:14)
  * Yes.  (farheen, 14:51:18)
  * Adi: High-level view screenshot. Cooked data sets and data lakes.
    Workbench is a massive set of assets. If you don't provide a
    workbench you have to provide management. How do you track the
    model? E2E view of what an ML workbench is.  (farheen, 14:53:05)
  * Anwar: Want to go through PTLs and their plans around architecture.
    (farheen, 14:56:31)
  * Ken: We are finished with testing and ask to have the release built
    for all the components. For release B, the test team is asking when
    the user stories will be ready.  (farheen, 14:57:12)
  * Anwar: No architecture impacts.  (farheen, 14:57:26)
  * Mukesh: We are creating the epics.  (farheen, 14:57:41)
  * Mukesh: Focus is on Boreas.  (farheen, 14:58:01)
  * Percentage complete? In terms of scoping the stories we are around
    35 - 40%.  (farheen, 14:58:41)
  * Phillippe: Regarding Boreas, we studied the ONNX format to onboard
    models. And then we will certainly have some impact coming from the
    training project. I have to coordinate with Pantellis to discuss the
    impact of the training project.  (farheen, 14:59:49)
  * Chris: So far the requirements I've reviewed do not require
    architecture changes. As far as Athena, we are fixing a bug for the
    maintenance release.  (farheen, 15:00:34)
  * We have put the code on the branch and documentation is ready.
    For Boreas there is a lot of impact from the pipelines and from
    putting Jupyter into the architecture. We are between 10 - 20%.
    (farheen, 15:01:33)
  * Deployment: Still preparing to do the Jira items. For Boreas we have
    to deprecate the existing. Documentation is left for Athena. For
    Boreas I wouldn't say any outlining has to take place.  (farheen,
    15:02:49)
  * ACTION: Anwar reach out to PTLs to get the remainder of the
    components.  (farheen, 15:03:45)
  * Michelle: From licensing we are getting good participation from
    Orange, etc. We are meeting twice a week.  (farheen, 15:04:17)



Meeting ended at 15:04:29 UTC.



Action items, by person
-----------------------

* farheen
  * Farheen Bring it up on the TSC call on Monday about the epics and
    exactly what will be the benefits.
* **UNASSIGNED**
  * Pantellis and Adi get together to further discuss.
  * Pantellis set up a lockdown.
  * Adi: Update slide, remove CMLP and replace with AI Acumos.
  * Anwar reach out to PTLs to get the remainder of the components.



People present (lines said)
---------------------------

* farheen (78)
* collabot` (3)



Generated by `MeetBot`_ 0.1.4