# Distributed Systems
## Common pattern in distributed systems research papers
1. Define the problem
2. Implement the solution
3. Test the solution
## Challenges in this field
- The overhead
- Cold start of containers. Possible solution: bypass the container and use a
sandbox instead
- Initialization of workers/executors
- Better [[scheduling]]
- Exploiting data locality
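
To make the last two items concrete, here is a minimal sketch of locality-aware scheduling. It is not taken from any particular system: the function name `schedule`, the `block_locations` map, and the worker names are all hypothetical. The idea is to prefer a worker that already holds a task's input block, and to fall back to the least-loaded worker otherwise.

```python
from collections import defaultdict


def schedule(tasks, block_locations, workers):
    """Assign each task to a worker, preferring workers that hold its input.

    tasks: list of (task_id, block_id) pairs (hypothetical format)
    block_locations: dict mapping block_id -> set of workers storing that block
    workers: list of worker names
    """
    load = defaultdict(int)  # number of tasks assigned to each worker so far
    assignment = {}
    for task_id, block_id in tasks:
        holders = block_locations.get(block_id, set())
        # Workers that already have the data locally, in stable list order.
        local = [w for w in workers if w in holders]
        # Fall back to every worker when no one holds the block.
        candidates = local or workers
        # Among the candidates, pick the least-loaded one.
        chosen = min(candidates, key=lambda w: load[w])
        assignment[task_id] = chosen
        load[chosen] += 1
    return assignment


if __name__ == "__main__":
    workers = ["w1", "w2", "w3"]
    block_locations = {"a": {"w1", "w2"}, "b": {"w3"}}
    tasks = [("t1", "a"), ("t2", "a"), ("t3", "b"), ("t4", "c")]
    print(schedule(tasks, block_locations, workers))
```

A real scheduler would also weigh transfer cost, queueing delay, and worker failures. This sketch only shows the basic trade-off between locality and load balance.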
## Two types of papers
Each year, thousands of papers on distributed systems are published. They fall
into two types:
### Prototyping
Most papers are in this category. It's okay if nobody is using the system. The
key questions are:
- Is your problem really a problem?
- Do your ideas work out?
### Production
Only a few papers are in this category. Some projects write a paper only after
they have gained a solid user base. These papers usually get accepted
"automatically", as their value has already been proven in practice.
Note that these production systems are extremely challenging to
[[maintenance|maintain]]!
- [[parsl|Parsl]] employed full-time developers to implement the system.
- [[spark|Apache Spark]] started as a prototype, and its paper was rejected at
first. The team then employed 3 engineers to refactor the whole system and make
it production-ready.
- [[cctools|ND CCTools]] has senior software engineer Ben Tovar to help maintain
the project.
- [[tensorflow|TensorFlow]] had 300 software engineers work on it for 2 years.
Funding matters! No funding, no engineers.