k6 Load Testing Tutorial: Complete 2026 Guide

k6 load testing tutorial for 2026 — install, scenarios, thresholds, auth, Grafana Cloud and CI/CD, with a real engineer's playbook and pros & cons.

Last updated: June 29, 2026 · 12 min read · By Avinash Kamble, reviewed by Priyanka G.

The first time I ran a serious k6 load test in anger, it was 11 PM the night before a Black Friday freeze at a payments client. The team was convinced their checkout API could hold 2,000 RPS. k6 took twelve minutes to prove it folded at 740 — a single misconfigured DB connection pool. We shipped the fix at 2 AM, slept three hours, and the launch held. That's the kind of confidence a good load test buys you, and this guide is the playbook I wish I'd had that night.

This k6 load testing tutorial takes you from zero install to production-grade scenarios, thresholds, auth flows, Grafana Cloud, and CI/CD wiring. Pair it with our JMeter Tutorial, k6 vs JMeter Comparison, and Load Testing Tools Comparison.

Key takeaways
k6 is JavaScript-scripted, Go-powered, and built for engineers — not GUIs.
Scenarios + thresholds let you encode a release SLO directly into the test.
One k6 instance comfortably drives 30k+ RPS; Grafana Cloud handles the rest.
The biggest wins come from CI gating with thresholds, not one-off runs.
Most teams fail load tests because of test data and ramp-up, not the tool.

What is k6?

k6 is an open-source load testing tool maintained by Grafana Labs. You script tests in JavaScript (ES2015+), and k6 executes them with a JavaScript engine embedded in a Go binary, not in Node.js. That design is why a single mid-range laptop can sustain 30k+ requests per second, roughly 15–30× what an equivalent JMeter setup pushes, though the exact multiple depends on the script, payloads, and hardware. The official docs live at grafana.com/docs/k6. k6's built-in metrics can be streamed to the backends most observability stacks already use, such as Prometheus and InfluxDB.

You'll see k6 used for three things in 2026: pre-release load gates in CI, capacity planning ahead of marketing pushes, and SRE-driven chaos and soak testing. If you're new to API-layer testing in general, skim our API Testing Tutorial first — k6 assumes you can already reason about endpoints, auth, and status codes.

Step 1 — Install k6

macOS

brew install k6

Windows

winget install k6

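Linux (Debian/Ubuntu)

# apt-key is deprecated on current Debian/Ubuntu releases, so the key goes into a dedicated keyring
sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6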

Verify:

k6 version
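# prints the installed release, Go version, and OS/arch (for example, a line starting with "k6 v0.50.0"); your version will differ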

Pro tip: on CI runners, skip the OS install entirely and use the official Docker image (grafana/k6) with an explicit version tag rather than latest. It's ~40 MB, pins your k6 version per pipeline, and removes a class of "works on my laptop" Slack threads. I lost half a day once chasing a threshold regression that turned out to be a k6 minor-version bump on a self-hosted runner.

Step 2 — Your First k6 Test

Create load-test.js:

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 10,           // 10 virtual users
  duration: '30s',   // run for 30 seconds
};

export default function () {
  const res = http.get('https://api.example.com/users');

  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  sleep(1);  // simulate user pacing
}

Run:

k6 run load-test.js

You'll see real-time metrics as k6 runs. The two numbers that matter on the first read are http_req_duration p(95) (tail latency) and http_req_failed (error rate). Everything else is supporting evidence.
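To make those two numbers concrete, here is a rough sketch of what they summarize. It is plain JavaScript, not k6 API: `percentile` and `errorRate` are hypothetical helpers, the sample durations are made up, and k6's own p(95) may differ in detail from this interpolation.

```javascript
// Hypothetical helpers: plain JavaScript, not part of the k6 API.

// Percentile with linear interpolation between the two nearest samples.
function percentile(samples, p) {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = (p / 100) * (sorted.length - 1);
  const lower = Math.floor(rank);
  const upper = Math.ceil(rank);
  const fraction = rank - lower;
  return sorted[lower] + (sorted[upper] - sorted[lower]) * fraction;
}

// Share of requests that failed, as a value between 0 and 1.
function errorRate(failedCount, totalCount) {
  return totalCount === 0 ? 0 : failedCount / totalCount;
}

// Ten made-up request durations in milliseconds, with one slow outlier.
const durations = [120, 130, 125, 140, 135, 128, 132, 138, 126, 900];
const p95 = percentile(durations, 95);
const median = percentile(durations, 50);
```

With this sample the median stays near 131 ms while the p(95) lands around 558 ms. A single slow outlier barely moves the middle of the distribution but dominates the tail, which is why the tail is the number to watch.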

Step 3 — Scenarios

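Scenarios let you model complex user patterns. Most production tests combine a baseline smoke check with a ramping load profile so the run mirrors a real launch. Scenarios in the same script start in parallel unless you set startTime on one of them, so the smoke VU below runs alongside the first minute of the ramp: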

export const options = {
  scenarios: {
    smoke: {
      executor: 'constant-vus',
      vus: 1,
      duration: '1m',
    },
    load: {
      executor: 'ramping-vus',
      stages: [
        { duration: '2m', target: 50 },
        { duration: '5m', target: 50 },
        { duration: '2m', target: 0 },
      ],
    },
  },
};

k6 executors — which one when

Picking the wrong executor is the #1 reason load tests lie. This is the cheat-sheet I keep pinned in our team wiki:

Executor	What it models	Use when	Avoid when
`constant-vus`	Fixed concurrency	Smoke + soak tests	You need a fixed throughput
`ramping-vus`	VU count ramps up/down	Standard load + stress	Backend has aggressive autoscaling — RPS drifts
`constant-arrival-rate`	Fixed RPS, VUs auto-allocated	SLO-style "hold 500 RPS" tests	Endpoints have wildly varying latency
`ramping-arrival-rate`	Ramps requests per second	Spike, flash-sale, Black Friday models	You can't pre-allocate enough VUs
`per-vu-iterations`	N iterations per VU	Deterministic, data-driven runs	You care about wall-clock duration
`shared-iterations`	N iterations shared across VUs	Fixed workload (e.g. 10k orders)	You need consistent concurrency
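The arrival-rate rows in the table hinge on having enough VUs pre-allocated. One rough way to size that is Little's law: concurrent iterations ≈ arrival rate × iteration duration. The sketch below is plain JavaScript with a hypothetical helper name and an assumed 50% headroom factor. It gives a starting estimate, not something k6 computes for you.

```javascript
// Hypothetical sizing helper based on Little's law (not part of the k6 API).
// ratePerSecond: target iterations started per second
// iterationSeconds: expected duration of one iteration, including any sleep()
// headroom: safety multiplier for latency spikes (assumed default: 1.5)
function estimatePreAllocatedVUs(ratePerSecond, iterationSeconds, headroom = 1.5) {
  const concurrent = ratePerSecond * iterationSeconds;
  return Math.max(1, Math.ceil(concurrent * headroom));
}

// "Hold 500 RPS" with iterations that take about 250 ms each.
const vus = estimatePreAllocatedVUs(500, 0.25);

// The estimate feeds a constant-arrival-rate scenario definition.
const holdRate = {
  executor: 'constant-arrival-rate',
  rate: 500,
  timeUnit: '1s',
  duration: '5m',
  preAllocatedVUs: vus,
  maxVUs: vus * 2, // extra ceiling in case iterations slow down under load
};
```

If latency doubles under load, the number of in-flight iterations doubles too, which is why the table warns against arrival-rate executors when latency varies wildly.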

Step 4 — Thresholds

Set pass/fail criteria:

export const options = {
  thresholds: {
    http_req_duration: ['p(95)<500'],   // 95th percentile < 500ms
    http_req_failed: ['rate<0.01'],      // error rate < 1%
    http_reqs: ['count>1000'],          // at least 1000 requests
  },
};

If any threshold fails, k6 exits with a non-zero status (exit code 99), which is exactly what a CI/CD gate needs. See our CI/CD Pipeline Testing Tutorial. The trick most teams miss: write thresholds per scenario with tagged metrics (e.g. http_req_duration{scenario:checkout}) so that fast, high-volume calls in another scenario can't pull the aggregate percentile down and hide a checkout SLO breach.
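As a minimal sketch of per-scenario thresholds, the helper below builds one latency gate per scenario using the tagged metric name format. The helper name, the scenario names, and the millisecond limits are all assumptions for illustration.

```javascript
// Hypothetical helper (not part of the k6 API): builds one p(95) latency gate
// per scenario, keyed by the tagged metric name k6 expects.
function perScenarioThresholds(p95LimitsMs) {
  const thresholds = {};
  for (const [scenario, limitMs] of Object.entries(p95LimitsMs)) {
    thresholds[`http_req_duration{scenario:${scenario}}`] = [`p(95)<${limitMs}`];
  }
  return thresholds;
}

// Assumed SLOs: checkout must stay fast, analytics is allowed to be slower.
const thresholds = perScenarioThresholds({ checkout: 500, analytics: 2000 });

// In a k6 script this object would go into the exported options, e.g.
//   export const options = { scenarios: { /* checkout, analytics */ }, thresholds };
```

The scenario names in the thresholds must match the keys under scenarios exactly, otherwise the tagged sub-metric never receives any samples.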

Step 5 — Checks vs Assertions

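import { check, fail } from 'k6';

// Check: records a pass/fail sample and lets the iteration continue
check(res, {
  'status is 200': (r) => r.status === 200,
});

// fail(): throws, aborting the current iteration only; the test keeps running
if (res.status !== 200) fail(`Expected 200, got ${res.status}`);

Use checks for monitoring, and fail() when the rest of an iteration is pointless after an error (for example, a failed login). Neither one fails or stops the run on its own: a failed check only moves the checks rate, and fail() ends just that iteration. To fail the build, put a threshold on the metric; to stop the whole test early, set abortOnFail on that threshold or call exec.test.abort() from k6/execution. In CI I almost always prefer checks + thresholds: a single 502 shouldn't abort a 30-minute soak run, but a 5% sustained error rate absolutely should fail the build.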
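A check never fails a run by itself, so the build gate has to come from thresholds. Here is a minimal sketch of that pairing, assuming a 99% check pass rate and a 5% error ceiling as the limits. Both numbers and the one-minute grace period are illustrative.

```javascript
// Thresholds that turn checks and error rate into build gates.
// In a k6 script, export this object: export const options = { ... };
const options = {
  thresholds: {
    // Fail the run if fewer than 99% of checks pass.
    checks: ['rate>0.99'],
    // Abort the whole test once the error rate reaches 5%, but only start
    // evaluating after one minute so a cold start cannot trip it.
    http_req_failed: [
      { threshold: 'rate<0.05', abortOnFail: true, delayAbortEval: '1m' },
    ],
  },
};
```

The short string form and the object form can be mixed in the same thresholds block. The object form is only needed when a threshold should abort the run early.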

Step 6 — Test Data

import http from 'k6/http';
import { SharedArray } from 'k6/data';

// open() and SharedArray run in the init context; the parsed array is shared read-only by all VUs
const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json'));
});

export default function () {
  // __VU starts at 1, so consecutive VUs map to consecutive users
  const user = users[__VU % users.length];
  http.post('https://api.example.com/login', JSON.stringify(user), {
    headers: { 'Content-Type': 'application/json' },
  });
}

Share data across VUs efficiently: one copy in memory, not one per VU. For 50k seed users that's the difference between a 200 MB and a 6 GB process. If your test data is sensitive, generate synthetic records at run time (for example with a bundled Faker build) instead of shipping a JSON file alongside the test.
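One way to avoid shipping real credentials is to generate obviously fake accounts in code. The sketch below uses a hand-rolled generator rather than Faker. The function name, email pattern, and password format are assumptions, and it only helps if the target environment is seeded with the same accounts.

```javascript
// Hypothetical generator (not a k6 or Faker API): builds deterministic,
// obviously fake users so no real credentials live next to the test.
function makeUsers(count) {
  const users = [];
  for (let i = 0; i < count; i++) {
    users.push({
      email: `loadtest-user-${i}@example.com`,
      password: `pw-${i.toString(16).padStart(8, '0')}`,
    });
  }
  return users;
}

const users = makeUsers(3);

// In a k6 script the generator can run inside the SharedArray callback, which
// keeps the single-copy memory benefit:
//   const users = new SharedArray('users', () => makeUsers(50000));
```

Because the output is deterministic, the same function can seed the staging database and drive the test, so the two never drift apart.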

Step 7 — Multiple Endpoints

import http from 'k6/http';

export default function () {
  // 50% of requests to /users
  if (Math.random() < 0.5) {
    http.get('https://api.example.com/users');
  } else {
    // 50% to /products
    http.get('https://api.example.com/products');
  }
}

For realistic traffic, weight requests by what your production logs actually show — 70/20/10 reads/writes/searches is closer to most SaaS apps than an even split.
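A weighted mix like that can be sketched with a small picker function. Everything here is illustrative: the helper name, the three paths, and the 70/20/10 weights would come from your own production logs.

```javascript
// Hypothetical weighted picker (plain JavaScript, not a k6 API).
// The 70/20/10 split and the paths are illustrative assumptions.
const trafficMix = [
  { name: 'read', weight: 70, path: '/users' },
  { name: 'write', weight: 20, path: '/orders' },
  { name: 'search', weight: 10, path: '/search?q=shoes' },
];

// roll is a number in [0, 1); pass Math.random() in a real test.
function pickRequest(mix, roll) {
  const total = mix.reduce((sum, entry) => sum + entry.weight, 0);
  let remaining = roll * total;
  for (const entry of mix) {
    if (remaining < entry.weight) return entry;
    remaining -= entry.weight;
  }
  return mix[mix.length - 1]; // guard against floating-point edge cases
}

const choice = pickRequest(trafficMix, Math.random());
// In a k6 script: http.get(`https://api.example.com${choice.path}`);
```

Keeping the mix in a data table rather than nested if/else branches makes it easy to update when the production ratio shifts.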

Step 8 — Authentication

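import http from 'k6/http';

// setup() runs once before the load starts; its return value is passed to every VU
export function setup() {
  const loginRes = http.post('https://api.example.com/auth/login', {
    email: 'admin@example.com',
    // ADMIN_PASSWORD is a hypothetical env var; pass it with: k6 run -e ADMIN_PASSWORD=... script.js
    password: __ENV.ADMIN_PASSWORD,
  });
  return { token: loginRes.json('token') };
}

export default function (data) {
  http.get('https://api.example.com/admin/users', {
    headers: { Authorization: `Bearer ${data.token}` },
  });
}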

For deeper API patterns, see our API Testing Tutorial. If you're load-testing OAuth flows, mint tokens in setup() and reuse them — hammering your identity provider with 5k logins per minute almost always gets you rate-limited before the system-under-test does.
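When a test outlives the token's lifetime, a small cache keeps logins rare without letting tokens expire mid-run. This is a sketch under assumptions: `createTokenCache`, the 30-second refresh skew, and the 5-minute token lifetime are all hypothetical, and the fake minting function stands in for a real login request.

```javascript
// Hypothetical token cache (plain JavaScript, not a k6 API).
// mintToken is any function returning { token, expiresAtMs }; in a k6 script
// it would wrap the http.post() login call.
function createTokenCache(mintToken, refreshSkewMs = 30000) {
  let cached = null;
  return function getToken(nowMs = Date.now()) {
    const stale = cached === null || nowMs >= cached.expiresAtMs - refreshSkewMs;
    if (stale) cached = mintToken(nowMs);
    return cached.token;
  };
}

// Stand-in for the identity provider: counts how often it is called.
let loginCalls = 0;
function fakeMint(nowMs) {
  loginCalls += 1;
  return { token: `token-${loginCalls}`, expiresAtMs: nowMs + 300000 };
}

const getToken = createTokenCache(fakeMint);
const first = getToken(0);       // cache empty: mints a token
const second = getToken(60000);  // still valid: reuses it
const third = getToken(280000);  // within 30 s of expiry: mints again
```

Each k6 VU runs its own JavaScript runtime, so a module-level cache like this is per VU. That still cuts logins from one per iteration to one per VU per token lifetime.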

Step 9 — k6 Cloud (Grafana Cloud)

k6 cloud load-test.js

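This runs in Grafana Cloud with distributed load generation across multiple regions, real-time dashboards, threshold-based alerts, and long-term trend analysis. Recent k6 releases spell the command k6 cloud run load-test.js and expect you to authenticate first, so check k6 cloud --help for the form your version supports. The free tier covers most pre-release smoke runs; paid plans unlock multi-region and longer soak windows.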

Step 10 — CI/CD Integration

GitHub Actions

name: k6 Load Test
on: [pull_request]
jobs:
  load:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
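      # grafana/k6-action is archived; Grafana now publishes separate setup and run actions
      - name: Install k6
        uses: grafana/setup-k6-action@v1
      - name: Run k6
        uses: grafana/run-k6-action@v1
        with:
          path: tests/load/smoke.js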

Docker

docker run --rm -v "$PWD":/scripts grafana/k6 run /scripts/load-test.js

See our GitHub Actions for Automation Testing guide for full pipeline patterns. For PR gating, run a tight 60-second smoke on every PR and the full ramp-up scenario on nightly main — most teams blow up their CI minutes by running 30-minute soaks per push.
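One way to keep the PR smoke and the nightly ramp in a single script is to pick the options from an environment variable. The profile names, VU counts, and limits below are assumptions for illustration; `PROFILE` is a hypothetical variable name.

```javascript
// Hypothetical profile table (names and numbers are assumptions).
const profiles = {
  smoke: {
    vus: 5,
    duration: '60s',
    thresholds: { http_req_failed: ['rate<0.01'] },
  },
  nightly: {
    stages: [
      { duration: '2m', target: 50 },
      { duration: '5m', target: 50 },
      { duration: '2m', target: 0 },
    ],
    thresholds: { http_req_duration: ['p(95)<500'], http_req_failed: ['rate<0.01'] },
  },
};

// Falls back to the cheap smoke profile when the name is missing or unknown.
function selectProfile(name) {
  return Object.prototype.hasOwnProperty.call(profiles, name) ? profiles[name] : profiles.smoke;
}

// In a k6 script: export const options = selectProfile(__ENV.PROFILE);
const prOptions = selectProfile(undefined);
const nightlyOptions = selectProfile('nightly');
```

The nightly job would then run something like k6 run -e PROFILE=nightly tests/load/smoke.js, while PR runs omit the variable and get the 60-second smoke by default.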

Advanced k6 Patterns

Modular test scripts

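// helpers/auth.js
import http from 'k6/http';

export function login() {
    // ADMIN_PASSWORD is a hypothetical env var supplied with: k6 run -e ADMIN_PASSWORD=...
    const res = http.post('https://api.example.com/auth/login',
        JSON.stringify({ email: 'admin@example.com', password: __ENV.ADMIN_PASSWORD }),
        { headers: { 'Content-Type': 'application/json' } });
    return res.json('token');
}

// load-test.js
import http from 'k6/http';
import { login } from './helpers/auth.js';

// Log in once in setup() rather than on every iteration
export function setup() {
    return { token: login() };
}

export default function (data) {
    http.get('https://api.example.com/admin/users', {
        headers: { Authorization: `Bearer ${data.token}` }
    });
}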

Custom metrics

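import http from 'k6/http';
import { Trend, Counter } from 'k6/metrics';

const checkoutDuration = new Trend('checkout_duration', true); // true = values are times in ms
const errorCount = new Counter('errors');

// Placeholder body; replace with a real checkout payload
const payload = JSON.stringify({ cartId: 'demo-cart' });

export default function () {
    const res = http.post('https://api.example.com/checkout', payload, {
        headers: { 'Content-Type': 'application/json' },
    });
    // res.timings.duration is k6's own measurement for this request, in milliseconds
    checkoutDuration.add(res.timings.duration);
    if (res.status !== 200) errorCount.add(1);
}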

Multiple scenarios in parallel

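import http from 'k6/http';

export const options = {
    scenarios: {
        browse: {
            executor: 'constant-vus',
            vus: 100,
            duration: '10m',
            exec: 'browse',
        },
        buy: {
            executor: 'ramping-arrival-rate',
            startRate: 1,
            timeUnit: '1s',
            preAllocatedVUs: 10,
            maxVUs: 50, // headroom if iterations slow down under load
            stages: [{ duration: '10m', target: 10 }],
            exec: 'buy',
        },
    },
};

// Each exec value must name an exported function in this script
export function browse() {
    http.get('https://api.example.com/products');
}

export function buy() {
    // Placeholder payload; replace with a real order body
    http.post('https://api.example.com/checkout', JSON.stringify({ cartId: 'demo-cart' }), {
        headers: { 'Content-Type': 'application/json' },
    });
}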

Stress testing

export const options = {
    stages: [
        { duration: '2m', target: 100 },
        { duration: '5m', target: 100 },
        { duration: '2m', target: 200 },
        { duration: '5m', target: 200 },
        { duration: '2m', target: 300 },
        { duration: '5m', target: 300 },
        { duration: '2m', target: 0 },
    ],
};
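Staircase profiles like the one above are repetitive to write by hand, so a small generator can help when you want more steps. This is a sketch with a hypothetical helper name; it only builds the stages array and leaves the rest of the options to you.

```javascript
// Hypothetical helper (not a k6 API): builds a staircase of ramp + hold
// stages, then a final ramp down to zero.
function stepStages({ stepSize, steps, ramp, hold }) {
  const stages = [];
  for (let step = 1; step <= steps; step++) {
    const target = stepSize * step;
    stages.push({ duration: ramp, target }); // climb to the next plateau
    stages.push({ duration: hold, target }); // hold there
  }
  stages.push({ duration: ramp, target: 0 }); // ramp down
  return stages;
}

// Reproduces the 100 -> 200 -> 300 VU profile listed above.
const stages = stepStages({ stepSize: 100, steps: 3, ramp: '2m', hold: '5m' });

// In a k6 script: export const options = { stages };
```

Raising steps until thresholds start failing is a simple way to locate the breaking point without rewriting the stage list each time.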

Soak testing

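export const options = {
    scenarios: {
        // The threshold tag below only matches if the scenario is actually named "soak"
        soak: {
            executor: 'ramping-vus',
            stages: [
                { duration: '5m', target: 100 },
                { duration: '4h', target: 100 },
                { duration: '5m', target: 0 },
            ],
        },
    },
    thresholds: {
        'http_req_duration{scenario:soak}': ['p(99)<500'],
    },
};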

k6 Pros and Cons (from real projects)

I've shipped k6 on three production stacks — a payments API, a high-traffic media CMS, and an internal HR platform. Here's the honest scorecard.

Pros	Cons
JavaScript scripting: every engineer on the team can read tests	Single-machine memory ceiling without Grafana Cloud or distributed runners (the VU limit depends heavily on script size and hardware)
Native CLI exit codes: drops into CI with zero glue	Browser-level testing only through the k6 browser module, which is Chromium-only
Excellent Grafana / Prometheus integration out of the box	Not Node.js: no `fs` or other Node built-ins, and npm packages only work if bundled and free of Node APIs
Tagged metrics + threshold-per-tag are best-in-class	Distributed runs require Grafana Cloud or running the k6 Operator on your own Kubernetes cluster
Active OSS community, regular releases	Steeper ramp than JMeter for non-coders

If you're choosing between tools, our k6 vs JMeter comparison walks through the decision matrix with team-size and budget cuts.

Best Practices

Do

Use scenarios for realistic user patterns
Use thresholds for pass/fail criteria, tagged per scenario
Use checks for monitoring, and abortOnFail thresholds for hard stops
Use SharedArray for test data
Run in CLI mode for local dev, Docker for CI
Use Grafana Cloud for production-scale, multi-region testing

Don't

Don't run heavy load from your laptop
Don't add custom timers that fake performance
Don't test against production unless you've coordinated with SRE
Don't skip the ramp-up (test in real conditions)
Don't ignore failed thresholds — fail the build

Common Mistakes

1 — No ramp-up

// BAD — sudden spike
vus: 1000,
duration: '1m',

// GOOD — gradual ramp-up
stages: [
  { duration: '1m', target: 100 },
  { duration: '5m', target: 100 },
  { duration: '1m', target: 0 },
],

2 — Testing in dev environment

Test in staging that mirrors production. Dev environments are too small to test realistic load and you'll chase ghosts for a week.

3 — Ignoring think time

Add sleep() to simulate user pacing. Real users don't hammer APIs.
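A fixed sleep(1) makes every VU pause for exactly the same time, which can keep them marching in lockstep. A randomized pause is one common alternative. The helper name and the 1 to 3 second range below are assumptions.

```javascript
// Hypothetical helper (not a k6 API): random think time in seconds.
// rand is injectable so the function can be tested; it defaults to Math.random.
function thinkTime(minSeconds, maxSeconds, rand = Math.random) {
  return minSeconds + rand() * (maxSeconds - minSeconds);
}

const pause = thinkTime(1, 3);
// In a k6 script: sleep(thinkTime(1, 3));
```

k6's sleep() accepts fractional seconds, so the result can be passed straight through.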

4 — Testing the wrong thing

Focus on critical user journeys, not every endpoint. For broader context, see our Load Testing Tools Comparison.

5 — Trusting a single run

Network jitter, autoscaling cold starts, and noisy neighbors all skew one-shot results. Run the same scenario three times and treat the median as truth.
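Taking the median of three runs is simple enough to script. The sketch below is plain JavaScript with made-up p(95) values; in practice the inputs would come from each run's exported summary.

```javascript
// Hypothetical helper: median of per-run results (plain JavaScript).
function median(values) {
  if (values.length === 0) throw new Error('median() needs at least one value');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Illustrative p(95) latencies in ms from three runs of the same scenario.
const p95PerRun = [412, 655, 430];
const representativeP95 = median(p95PerRun);
```

Here the 655 ms run is treated as the outlier and 430 ms becomes the figure to compare against the SLO. If the three runs disagree wildly, that spread is itself worth investigating before trusting any single number.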

Frequently asked questions

Is k6 better than JMeter?

For developer-led load testing, k6 pushes more load per machine (the 15–30× figure is a ballpark that varies with script and hardware), is easier to script in JavaScript, and has better CI/CD integration. JMeter still wins for broader protocol coverage (JDBC, JMS, FTP) and GUI-driven test design for non-coders.

How long does it take to learn k6?

For an experienced developer comfortable with JavaScript: 1–2 weeks to ship a production-grade scenario. For a manual tester new to scripting: 4–6 weeks with daily practice.

How many concurrent users can k6 simulate?

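A single k6 instance comfortably drives 30k+ RPS and 5k VUs on a modern 8-core machine, and that VU figure is conservative: Grafana's guidance on running large tests reports 30,000–40,000 VUs from one well-tuned instance, depending on script complexity, memory, and network. For 100k+ VUs or multi-region load, use Grafana Cloud distributed execution.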

Is k6 free?

Yes — k6 is free and open source under the AGPL-3.0 license. k6 Cloud (Grafana Cloud) has a free tier and paid plans for distributed execution, longer soak windows, and long-term dashboards.

Can k6 test APIs, WebSockets and gRPC?

Yes — k6 has first-class support for HTTP/1.1, HTTP/2, WebSockets, and gRPC, and a browser module for chromium-based UI flows. It's a strong fit for API and microservice load testing.

What's the difference between k6 and Gatling?

k6 uses JavaScript and a Go runtime; Gatling uses Scala/Java/Kotlin and a JVM runtime. Both push huge throughput from a single node. Pick k6 if your team writes JS daily; pick Gatling if you're a JVM shop with existing Maven/Gradle pipelines.

Should I run k6 against production?

Only with explicit SRE sign-off, a kill-switch, and a tight blast radius (low VUs, idempotent endpoints, off-peak window). For most teams, a production-shaped staging environment is the safer default.