Version: v1.0.0

Open Science Grid (OSG)

About OSG

The Open Science Grid (OSG) is a distributed computing infrastructure that provides researchers with access to a vast pool of computing resources. It is an effort to enable scientists to perform large-scale data analysis and simulations across multiple partnered institutions and research facilities. OSG is designed to support a wide range of scientific disciplines such as physics, chemistry and climate science.

The main focus of OSG is to enable distributed High Throughput Computing (HTC) by providing a platform for researchers to run their applications.

High Performance Computing (HPC) vs High Throughput Computing (HTC)

| HPC | HTC |
| --- | --- |
| Focuses on maximizing the performance of a single job or a small number of jobs. | Focuses on maximizing the total throughput of many jobs, often with a large number of independent tasks. |
| Typically used for applications that require significant computational resources and have long runtimes. | Typically used for applications that can be broken down into many smaller tasks that can be executed in parallel. |

Problems Suitable for HTC and OSG

HTC is well-suited for workloads that can be decomposed into many independent tasks executed concurrently. Common use cases include:

  • Large-Scale Data Analysis: Processing and extracting insights from datasets generated by experiments, observations, or simulations.
  • Simulation Workflows: Running computationally intensive simulations across physics, chemistry, biology, and climate science, where many independent simulation instances can execute in parallel.
  • Parameter Sweeps & Optimization: Systematically exploring parameter spaces by executing large numbers of independent runs, that is, running the same application with different parameters to find optimal configurations or to understand how sensitive the results are to parameter changes.
  • Machine Learning and AI: Running many independent training tasks with different parameters and/or data subsets.
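
A parameter sweep of the kind described above can be sketched in a few lines of Python. This is only an illustration: the parameter names (`temperature`, `pressure`, `seed`), their values, and the command-line flag format are hypothetical and do not come from OSG. The point is that each combination becomes one independent task, which is the shape of workload HTC handles well.

```python
import itertools

# Hypothetical parameter grid; names and values are illustrative only.
param_grid = {
    "temperature": [280, 300, 320],
    "pressure": [1.0, 2.0],
    "seed": [0, 1],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    names = list(grid)
    combos = itertools.product(*(grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def to_arguments(task):
    """Render one task as an argument string for a hypothetical application."""
    return " ".join(f"--{name} {value}" for name, value in task.items())


tasks = expand_grid(param_grid)
for index, task in enumerate(tasks):
    # Each line describes one independent job; none depends on another's output.
    print(index, to_arguments(task))
```

Because no task reads another task's output, the resulting list can be handed to a scheduler in any order and run on whatever resources become available.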

Why Use OSG?

  • Scalability: OSG allows researchers to scale their computations across a large number of resources, enabling them to tackle complex problems that require significant computational power.
  • Collaboration: OSG fosters collaboration among researchers by providing a shared infrastructure that can be accessed by multiple institutions. This promotes interdisciplinary research and allows scientists to work together on large projects.
  • Cost-effectiveness: By utilizing OSG, researchers can access computing resources without the need for significant upfront investments in hardware. This makes it a cost-effective solution for many research projects.
  • Flexibility: OSG supports a wide range of applications and workflows, allowing researchers to use the tools and software that best suit their needs. This flexibility makes it an attractive option for

How to Apply to Use OSG?

To apply for access to OSG, follow these steps:

  1. Sign up for an account via the OSG Portal.
  2. After signing up, you will need to request access to the OSG resources. This typically involves filling out a form that describes your research project and how you plan to use the OSG resources.
  3. Once your application is reviewed and approved, you will receive access to the OSG resources. You may need to set up your environment and install any necessary software to start using OSG for your research.

Where to Learn More about OSG?

The table below, from the OSG website, summarizes what kind of work benefits from OSG.

| | Ideal Jobs! | Still very advantageous | Maybe not, but get in touch! |
| --- | --- | --- | --- |
| Expected Throughput, per user | 1000s concurrent cores | 100s concurrent cores | Discuss with OSG |
| CPU | 1 per job | < 12 per job | > 12 per job |
| GPU | 1 per job | < 4 per job | > 4 per job |
| Walltime | < 10 hrs* | < 20 hrs* | > 20 hrs |
| RAM | < few GB | < 40 GB | > 40 GB |
| Input | < 10 GB | < 40 GB | > 40 GB** |
| Output | < 10 GB | < 40 GB | > 40 GB** |
| Software | pre-compiled binaries, containers | Most other than → | Licensed Software, non-Linux |

** Courtesy of OSG Website.
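
As a rough aid for reading the table, the per-job thresholds can be encoded in a small Python check. This is a sketch under stated assumptions, not an OSG tool: the function and field names are hypothetical, "few GB" of RAM is assumed to mean 4 GB, a value that lands exactly on a limit the table leaves unspecified is placed in the stricter category, and a job is rated by its most demanding resource.

```python
# Assumption: the table only says "< few GB"; 4 GB is a placeholder value.
FEW_GB = 4

# Per resource: (ideal limit, whether the ideal limit is inclusive, advantageous limit).
# "1 per job" is read as "at most 1"; every "<" entry is a strict comparison.
THRESHOLDS = {
    "cpus": (1, True, 12),
    "gpus": (1, True, 4),
    "walltime_hours": (10, False, 20),
    "ram_gb": (FEW_GB, False, 40),
    "input_gb": (10, False, 40),
    "output_gb": (10, False, 40),
}

TIERS = ["ideal", "still very advantageous", "discuss with OSG"]


def resource_tier(name, value):
    """Return 0, 1, or 2 for a single resource requirement."""
    ideal_limit, inclusive, advantageous_limit = THRESHOLDS[name]
    if value < ideal_limit or (inclusive and value == ideal_limit):
        return 0
    if value < advantageous_limit:
        return 1
    return 2


def classify_job(job):
    """Rate a job by its most demanding resource (an assumption, not an OSG rule)."""
    worst = max(resource_tier(name, job.get(name, 0)) for name in THRESHOLDS)
    return TIERS[worst]


small_job = {"cpus": 1, "walltime_hours": 2, "ram_gb": 2, "input_gb": 1, "output_gb": 1}
medium_job = {"cpus": 8, "walltime_hours": 15, "ram_gb": 16, "input_gb": 20, "output_gb": 5}
long_job = {"cpus": 1, "walltime_hours": 30, "ram_gb": 2}

for job in (small_job, medium_job, long_job):
    print(classify_job(job))
```

The expected-throughput and software rows are left out because they describe the overall workload rather than a numeric per-job requirement.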