Data Science & Machine Learning Platform

Learn about student-focused CPU/GPU resources for data science, machine learning, and interactive programming

UC San Diego's Data Science/Machine Learning Platform (DSMLP) provides undergraduate and graduate students with access to research-class CPU/GPU resources for coursework, formal independent study, and student projects.

Vendor logos: Jupyter, Kubernetes, Docker, NVIDIA, TensorFlow, PyTorch

Visit https://datahub.ucsd.edu to sign on to DSMLP.

Built and operated by IT Services (ITS), with additional financial contributions from Cognitive Science and Jacobs School of Engineering, DSMLP leverages Qualcomm Institute's current research into cost-effective machine-learning cyberinfrastructure using Kubernetes and Docker container technologies.

Jupyter Notebooks:

Web-based Jupyter notebooks allow students to combine live code, equations, visualizations and narrative text for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and more.  DSMLP's Jupyter notebooks offer straightforward interactive access to popular languages and GPU-enabled frameworks such as Python, R, Pandas, PyTorch, TensorFlow, Keras, NLTK, and AllenNLP.
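As a rough illustration of how a student might confirm which of these frameworks a given notebook environment actually provides, the sketch below can be pasted into a notebook cell. It is not an official DSMLP tool: the module names in FRAMEWORKS are assumptions about how the packages listed above are imported, and the presence of the nvidia-smi command is only a hint that a GPU driver is installed, not proof that a GPU has been allocated.

```python
import importlib.util
import shutil

# Assumed import names for the frameworks mentioned above; adjust for your course image.
FRAMEWORKS = ["pandas", "torch", "tensorflow", "keras", "nltk", "allennlp"]


def available_frameworks(names):
    """Map each top-level module name to True if it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def gpu_tools_present():
    """Return True if the nvidia-smi command is on PATH (a hint, not a guarantee, of GPU access)."""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    for name, ok in available_frameworks(FRAMEWORKS).items():
        print(f"{name:12s} {'available' if ok else 'missing'}")
    print("nvidia-smi found:", gpu_tools_present())
```

The check only looks modules up without importing them, so it stays fast even when large frameworks are installed.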

Screen shot of a Jupyter notebook illustrating embedded graphs (low resolution/detail)

Machine Learning:

Complex ML workflows are supported through terminal/SSH logins, background batch jobs, and a full Linux/Ubuntu CUDA development suite.  Users may install additional library packages (e.g. conda/pip, CRAN) as needed, or can opt to replace the default environment entirely by launching their own custom Docker containers.
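As one possible way to add a pip package from inside a running notebook or script, the sketch below builds and runs a pip command with the current Python interpreter. The helper names, the example package "seaborn", and the use of pip's --user flag are illustrative assumptions, not documented DSMLP procedure; in some conda or virtual environments the --user flag is rejected and should be dropped.

```python
import subprocess
import sys


def pip_install_command(package, user_site=True):
    """Build the argument list for installing a package with the current interpreter's pip."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if user_site:
        # Assumption: install into the per-user site directory rather than the shared image.
        cmd.append("--user")
    cmd.append(package)
    return cmd


def install(package, user_site=True):
    """Run pip; raises subprocess.CalledProcessError if the install fails."""
    subprocess.run(pip_install_command(package, user_site), check=True)


if __name__ == "__main__":
    # Print the command instead of running it; call install("seaborn") to actually install.
    print(" ".join(pip_install_command("seaborn")))
```

Invoking pip through sys.executable helps ensure the package lands in the same interpreter the notebook kernel is using.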

High-speed cluster-local storage houses student workspaces, course files, and common training corpora (e.g. CIFAR, ImageNet).

Instructor Support:

ITS/Educational Technology Services (ETS) consultants are available to assist with course-related software and container configuration, assignment distribution/collection, and staging of course-specific datasets.

Request DSMLP for your course

To request Datahub/DSMLP or AWS Educate for an undergraduate or graduate-level course, please submit a SIC Course Request Form.

Eligibility & Requesting Access

DSMLP Prerequisites: Basic Linux & SSH Proficiency

Independent non-course use of DSMLP presumes basic technical proficiency with Linux and SSH/command-line tools. Those unfamiliar with such usage should take advantage of online tutorials such as DigitalOcean's SSH Essentials or campus-based Carpentries workshops.

Students:

Students may obtain access to DSMLP through the following routes:

  1. By automatic setup of courses assigned to DSMLP: enrolled students receive access shortly before the first day of each term.
  2. Via the DSMLP Independent Study Access Request Form
  3. Upon request, for students interested in exploring the environment for personal enrichment: please email datahub@ucsd.edu.

Faculty & Staff

Instructors and instructional support staff may request personal DSMLP access in order to support their students' use of the platform, or to explore and evaluate the environment for course development purposes.

To request an account, please email your request to datahub@ucsd.edu. To request Datahub/DSMLP or AWS Educate for your undergraduate or graduate-level course, submit a SIC Course Request Form.

Please note that DSMLP is limited to student-focused activities. Research IT Services can help identify comparable GPU compute environments for your non-student research needs.

Professional Programs:

Due to funding restrictions, DSMLP resources are not currently available to self-supported professional degree programs' coursework or capstone projects.

Student and Instructor Documentation

Detailed documentation for both students and instructors is available on the Service Portal Knowledge Base - Specialized Instructional Computing.

Independent Study, Student Research, and Student Projects

In support of UC San Diego's instructional mission, DSMLP resources are made available for many independent-study and student research activities, including:

  • Designated independent study/research courses (e.g. 198/199, 293/298/299)
  • Independent thesis/dissertation research*
  • State-supported capstone projects
  • Campus-sponsored co-curricular activities (e.g. projects, workshops, clubs, teams)

Please use the DSMLP Independent Study Access Request Form to outline the scope of your project and describe its computational needs. If you are requesting resources as part of a team, only one member need submit a form.

Due to state funding restrictions, DSMLP cannot be used for self-supported professional degree programs' coursework or capstone projects.

*Please see the next section, "Special considerations for Doctoral research", for additional information regarding use of DSMLP for dissertation research.

Research Computing (Non-course/Non-credit)

Research IT Services offers the UCSD Research Cluster to provide faculty, staff, postdoctoral, and student researchers with CPU/GPU compute resources for projects that are non-instructional and data-intensive. Research Cluster resources are made available for research activities, including:

  • UC San Diego-affiliated research
  • Independent student research (e.g. non-course/non-credit activities)

Please use the Research Cluster Request Form to outline the scope of your project and describe its computational needs. If you are requesting resources as part of a team, only one member need submit a form. For information about alternative computing services, please email research-it@ucsd.edu.

To report problems with DSMLP, or to request assistance, please email datahub@ucsd.edu. (Students' course-specific questions are best directed to the instructor, TA, or tutors.)