Mobile CI/CD Testing

Current state and future vision for iOS & Android test automation

Test Coverage Today

🍎 iOS

  • ✓ Unit Tests (Fastlane + XCTest)
  • ✓ Snapshot Tests
  • ✓ UI Tests (Manual trigger)
  • ✓ ZHL UI Tests
  • ✓ Draco UI Tests
  • ✓ Listing Search UI Tests
  • ✓ SonarQube Analysis

🤖 Android

  • ✓ Unit Tests (Gradle)
  • ✓ Snapshot Tests
  • ✓ Lint Checks
  • ✓ Emulator.wtf UI (Nightly)
  • ✓ Manual MR UI Tests
  • ✓ Code Coverage (Kover)

iOS vs Android: Line-by-Line

Metric | 🍎 iOS | 🤖 Android
Total Pipelines (30 days) | 1,001 | 351
Overall Success Rate | 48.5% | 79.7%
Overall Failure Rate | 51.4% | 20.2%
Total Failures | 491 | 70
Main Branch Success | 26.2% | 78.7%
Main Branch Failures | 245 / 332 | 25 / 117
Release Branch Success | 76.5% | 80.0%
Primary Failure Source | UI Tests | Emulator.wtf
Top Failing Job | test:ui:draco | ui-tests:nightly
Coverage Tool | llvm-cov | Kover
Coverage Integration | ✓ SonarQube | ✓ SonarQube + Badge
Coverage Job | In test:unit | Dedicated job

What We're Doing Well

⚡

Fast Feedback

Unit tests run on every MR, providing immediate feedback to developers

📊

Coverage Tracking

Comprehensive coverage reports integrated with SonarQube for quality metrics

🔄

Parallel Execution

Test jobs run independently, with no dependencies between them, maximizing pipeline speed

Areas for Improvement

🎯

Manual UI Tests

iOS UI tests are manual-only on MRs, reducing confidence before merge

⏰

Limited Android UI

Emulator.wtf runs nightly only, so issues are caught only after they're merged

🔍

Test Visibility

No unified dashboard for test results across both platforms

📈

Flakiness Tracking

No systematic approach to identifying and fixing flaky tests

Pipeline Success Rates

🍎 iOS

48.5%
Success Rate (share of completed pipelines: success + failed)
1,001 TOTAL PIPELINES:
✓ 464 Success
✗ 491 Failed
+ 46 other (canceled/manual/running)

🤖 Android

79.7%
Success Rate (share of completed pipelines: success + failed)
351 TOTAL PIPELINES:
✓ 276 Success
✗ 70 Failed
+ 5 other (running/skipped)
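
The percentages above match a calculation over completed pipelines only (success + failed), leaving the "other" pipelines out of the denominator and truncating to one decimal place. The sketch below reproduces that arithmetic. The function name is illustrative, and the truncation is inferred from the figures, not stated in the source.

```python
def rate_percent(count: int, success: int, failed: int) -> float:
    """Share of completed pipelines, truncated to one decimal place."""
    completed = success + failed  # canceled/manual/running/skipped are excluded
    # Integer arithmetic keeps the truncation exact.
    return (count * 1000 // completed) / 10


ios = {"success": 464, "failed": 491}
android = {"success": 276, "failed": 70}

print(rate_percent(ios["success"], **ios))          # 48.5
print(rate_percent(ios["failed"], **ios))           # 51.4
print(rate_percent(android["success"], **android))  # 79.7
print(rate_percent(android["failed"], **android))   # 20.2
```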

Main Branch Instability

🍎

iOS Main Branch

73.7%
Failure Rate

245 failures out of 332 main pipelines

3 out of 4 main builds fail

🤖

Android Main Branch

21.3%
Failure Rate

25 failures out of 117 main pipelines

Better but still concerning

Target: <5% main branch failure rate · Currently: 4.3x-14.7x the target

Where Pipelines Fail

🍎 iOS Top Failures

  • ✗ Main branch: 245 failures (50% of all failures)
  • ✗ fix-qa-1: 55 failures
  • ✗ ant/OptimizePipeline: 37 failures
  • ✗ fix-qa: 22 failures
  • ✗ Release branches: 4 failures (23.5% failure rate)
Root Cause: UI tests (test:ui:draco, test:ui:zhl)

🤖 Android Top Failures

  • ✗ Main branch: 25 failures (36% of all failures)
  • ✗ MR !1682: 9 failures (highly flaky)
  • ✗ MR !1662: 4 failures
  • ✗ Various MRs: 2-3 failures each
  • ✗ Release branches: 1 failure (20% failure rate)
Root Cause: Emulator.wtf nightly UI tests

Future Testing Strategy

Today

  • UI tests manual or nightly
  • Catch bugs post-merge
  • Fragmented test results
  • Manual flake investigation
→

Tomorrow

  • Automated UI tests on MRs
  • Catch bugs pre-merge
  • Unified test dashboard
  • Automated flake detection
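
One simple starting rule for automated flake detection: a test that both passed and failed on the same commit is a flake candidate, because the code did not change between runs. The sketch below assumes results can be exported as (commit, test, passed) records. The record shape, function name, and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class RunRecord(NamedTuple):
    commit: str
    test: str
    passed: bool


def flaky_candidates(runs: Iterable[RunRecord]) -> dict[str, int]:
    """Count, per test, the commits on which it both passed and failed."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for run in runs:
        outcomes[(run.test, run.commit)].add(run.passed)

    flaky: dict[str, int] = defaultdict(int)
    for (test, _commit), seen in outcomes.items():
        if len(seen) == 2:  # both True and False were seen on one commit
            flaky[test] += 1
    return dict(flaky)


# Made-up sample records, for illustration only.
runs = [
    RunRecord("abc123", "test:ui:draco", False),
    RunRecord("abc123", "test:ui:draco", True),  # same commit, different result
    RunRecord("abc123", "test:unit", True),
    RunRecord("def456", "test:ui:zhl", False),
    RunRecord("def456", "test:ui:zhl", False),   # consistent failure, not flaky
]
print(flaky_candidates(runs))  # {'test:ui:draco': 1}
```

A consistent failure is deliberately not counted: it points to a real breakage and needs a fix, not a flake ticket.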

Next Steps

🚨

1. iOS Main Emergency

P0 IMMEDIATE: 73.7% main branch failure rate (245/332). Root cause: UI tests. War room needed. This blocks ALL development.

🔧

2. Stabilize UI Tests

P0 THIS WEEK: Fix test:ui:draco and test:ui:zhl. Add retries, improve assertions, reduce flakiness. The 245 main branch failures are traced to these tests.
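
Retries are one of the fixes listed above. The sketch below shows how a retry wrapper might also record passes that needed a retry, so retried tests stay visible as flaky instead of blending into the green results. This is generic Python, not the Fastlane or XCTest retry mechanism, and all names are hypothetical.

```python
from typing import Callable


def run_with_retries(test: Callable[[], bool], max_attempts: int = 3) -> str:
    """Return 'passed', 'flaky' (passed only after a retry), or 'failed'."""
    for attempt in range(1, max_attempts + 1):
        if test():
            return "passed" if attempt == 1 else "flaky"
    return "failed"


# Illustrative stand-in for a UI test that fails once, then passes.
results = iter([False, True])
print(run_with_retries(lambda: next(results)))  # flaky
```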

🎯

3. Android Main Health

P1 THIS SPRINT: Reduce main branch failure rate from 21.3% to <5%. Fix Emulator.wtf nightly tests. Address MR !1682 flakiness (9 failures).

📊

4. Monitoring & Prevention

P1 NEXT SPRINT: Build flakiness dashboard. Track test health. Alert on degradation. Prevent future crises.
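
Alerting on degradation could start as a comparison between the main branch failure rate over a recent window and the <5% target. The sketch below assumes pipeline statuses arrive as plain strings ("success", "failed", or anything else) and ignores statuses other than those two. The function names and the window are illustrative.

```python
TARGET_FAILURE_RATE = 0.05  # the <5% main branch target from this deck


def failure_rate(statuses: list[str]) -> float:
    """Failure rate over completed pipelines; other statuses are ignored."""
    failed = statuses.count("failed")
    completed = failed + statuses.count("success")
    return failed / completed if completed else 0.0


def should_alert(statuses: list[str], target: float = TARGET_FAILURE_RATE) -> bool:
    """True when the window's failure rate is at or above the target."""
    return failure_rate(statuses) >= target


# Window shaped like the Android main branch figures: 25 failures out of 117.
window = ["failed"] * 25 + ["success"] * 92
print(should_alert(window))  # True
```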

Shift Left on Quality

Catch issues earlier, ship with confidence