Assisted Code Review with source{d} Lookout
Vadim Markovtsev
Machine Learning for Large Scale Code Analysis
Plan
- Origins
- What is Lookout?
- SDK
- From scratch: typo correction
Many efforts target boring stuff
Automatable ≠ unattended 😔
When to help?
- While you type = IDE
- While you check = CI
- While you review = PR
- Periodically, asynchronously = cron
IDE
- Instant feedback
- Rich information
- Rich UI
- Little time to run: feedback must be instant
- Many different IDEs and languages
CI
- Part of the workflow
- More time to run
- No UI
- Must not be wrong
- Longer feedback loop
PR reviews
- Part of the workflow
- More time to run
- UI
- Should not be wrong
- Longer feedback loop
cron
- Plenty of time to run
- Outside of the workflow
- No UI
- Longest feedback loop
Comparison
|      | ∈ workflow | UI | Feedback | Rich info | FP requirements |
|------|------------|----|----------|-----------|-----------------|
| IDE  | ✓          | ✓✓ | instant  | ✓         | low             |
| CI   | ✓          | ✗  | minutes  | ✗         | 0               |
| PR   | ✓          | ✓  | minutes  | ✗         | lowest          |
| cron | ✗          | ✗  | hours    | ✗         | low             |
When to help?
- While you type = IDE
- While you check = CI
- While you review = PR
- Periodically, asynchronously
Goals
- Assisted code review platform
- Tight git/GitHub integration
- Agnostic to the language being analyzed
- Agnostic to the language analyzers are written in
- Batteries included
Architecture
(Architecture diagram; labels: Push event, PR event)
docs.sourced.tech/lookout
src-d/lookout-sdk
- Single source of gRPC definitions
- Low-level API: Go, Python
- Low-level examples
src-d/lookout-sdk-ml
- High-level Python API
- Stateful analyzers
- Integrated with the source{d} ML ecosystem
Rule of 👍
High-level API
    class MyAnalyzer(Analyzer):
        @classmethod
        def train(cls, ...) -> AnalyzerModel:
            # ...

        def analyze(self, ...) -> [Comment]:
            # do something with self.model
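The train/analyze split above can be illustrated with a minimal, self-contained sketch. It does not use lookout-sdk-ml: `SketchModel`, `SketchComment`, and `LineLengthAnalyzer` are hypothetical stand-ins, and the "model" is just the longest line length seen in the base revision. The point is only the shape: `train` builds state once, `analyze` consumes it later.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchComment:
    """Hypothetical stand-in for a review comment (not the SDK class)."""
    file: str
    line: int
    text: str


@dataclass
class SketchModel:
    """Hypothetical stand-in for a trained model: one learned number."""
    max_line_length: int


class LineLengthAnalyzer:
    """Learns the longest line in the base revision, then flags longer ones."""

    def __init__(self, model: SketchModel):
        self.model = model

    @classmethod
    def train(cls, files: Dict[str, str]) -> SketchModel:
        # files maps path -> file content at the base revision
        longest = max(
            (len(line)
             for content in files.values()
             for line in content.splitlines()),
            default=0,
        )
        return SketchModel(max_line_length=longest)

    def analyze(self, changed: Dict[str, str]) -> List[SketchComment]:
        # changed maps path -> file content at the head revision
        comments = []
        for path, content in changed.items():
            for number, line in enumerate(content.splitlines(), start=1):
                if len(line) > self.model.max_line_length:
                    comments.append(SketchComment(
                        path, number,
                        f"line is longer than the repository maximum "
                        f"({self.model.max_line_length})",
                    ))
        return comments


base = {"a.py": "x = 1\nprint(x)\n"}
model = LineLengthAnalyzer.train(base)
comments = LineLengthAnalyzer(model).analyze({"a.py": "x = 1\nprint(x, x, x)\n"})
```

Here training happens once per repository state and the resulting model is reused for each review, which is what makes the analyzer stateful.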
Train
    @with_uasts_and_contents
    def train(cls,
              ptr: ReferencePointer,
              config: Dict[str, Any],
              data_service: DataService,
              files: Iterable[File]
              ) -> AnalyzerModel:
Analyze
    @with_changed_uasts_and_contents
    def analyze(self,
                ptr_from: ReferencePointer,
                ptr_to: ReferencePointer,
                data_service: DataService,
                changes: Iterable[Change]
                ) -> [Comment]:
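Since `analyze` receives two revisions and the changes between them, an analyzer typically comments only on what the PR touched. The sketch below shows one way to find the added lines between a base and a head version of a file, using the standard library's `difflib` rather than the SDK's `Change` objects. The helper name `added_lines` is hypothetical.

```python
import difflib
from typing import List, Tuple


def added_lines(base: str, head: str) -> List[Tuple[int, str]]:
    """Return (1-based line number in head, text) for lines new in head."""
    base_lines = base.splitlines()
    head_lines = head.splitlines()
    matcher = difflib.SequenceMatcher(a=base_lines, b=head_lines, autojunk=False)
    result = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both introduce new text on the head side
        if tag in ("insert", "replace"):
            for j in range(j1, j2):
                result.append((j + 1, head_lines[j]))
    return result


base = "class Packfile:\n    pass\n"
head = "class Packfile:\n    pass\n\nclass Packfle:\n    pass\n"
new = added_lines(base, head)
```

Restricting comments to these lines keeps the review focused on the PR and avoids flagging code the author did not change.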
Behind the scenes
- gRPC servers and clients
- Pooling and threading
- Database of trained models
- Caches
- Logging
- Metrics
Objective
    def read_packfile(...):
        # ...

    class Packfile:
Objective
    def read_packfile(...):
        # ...

    class Packfle:  # typo of "Packfile": this is what the analyzer should flag
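The objective can be sketched without any SDK at all: collect the identifiers already used in the repository, then suggest the closest known one for any unknown identifier in the PR. This sketch swaps in the standard library's `difflib.get_close_matches` for the `autocorrect` package mentioned in the steps. The regular expression, the `cutoff` value, and the function names are assumptions made for illustration.

```python
import difflib
import re
from typing import Iterable, Optional, Set

# Rough identifier pattern; a real analyzer would read identifiers from UASTs.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def collect_identifiers(sources: Iterable[str]) -> Set[str]:
    """'Train': gather every identifier-like token seen in the base revision."""
    vocabulary: Set[str] = set()
    for source in sources:
        vocabulary.update(IDENTIFIER.findall(source))
    return vocabulary


def suggest(identifier: str, vocabulary: Set[str],
            cutoff: float = 0.85) -> Optional[str]:
    """'Analyze': return the closest known identifier, or None if all is fine."""
    if identifier in vocabulary:
        return None
    matches = difflib.get_close_matches(
        identifier, sorted(vocabulary), n=1, cutoff=cutoff)
    return matches[0] if matches else None


base = "def read_packfile(path):\n    pass\n\nclass Packfile:\n    pass\n"
vocabulary = collect_identifiers([base])
suggestion = suggest("Packfle", vocabulary)
```

With this vocabulary, `Packfle` is close enough to `Packfile` to produce a suggestion, while identifiers that are already known or have no near neighbor yield `None`.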
Steps
- Install lookout-sdk-ml and autocorrect
- Write typos.py
- Fork a repo, create a PR
- Generate a new GitHub Personal Access Token
- Run our analyzer
Summary
- source{d} decided to build products for assisted code review
- Lookout is a great platform for assisted code review
- Writing new analyzers is easy and fun