Tutorials on Benchmark

Learn about Benchmark from fellow newline community members!

  • React
  • Angular
  • Vue
  • Svelte
  • NextJS
  • Redux
  • Apollo
  • Storybook
  • D3
  • Testing Library
  • JavaScript
  • TypeScript
  • Node.js
  • Deno
  • Rust
  • Python
  • GraphQL

How Good is Good Enough: A Guide to Common LLM Benchmarks

In our last article, we talked about benchmarking as the highest-level method of assessing the performance of LLMs. Today, we’re going to look in more detail at some of the most popular benchmarks: what they measure and how they measure it. Most of the benchmarks listed below have public leaderboards and question sets if you want to dive deeper, and I’ve included links to papers where appropriate. Let’s dive in!

Benchmarking a Go and chi RESTful API

The amount of time and effort a developer dedicates to writing a function depends on the details they choose to focus on: coding conventions, structure, programming style, etc. Suppose a group of developers is given a high-level prompt to write the same function: given some input, return some output. For example, given a list of numbers, return a sorted list of numbers. The actual implementation of the function is left entirely to the discretion of the developer.

A quick, mathematical way to evaluate each developer's implementation, without writing any additional code, is by its time complexity. In particular, an implementation's Big-O complexity tells us how it might perform in the worst-case scenario, commonly when the size of the input is very large. However, time complexity does not account for the hardware the function runs on, and it does not provide any tangible, quantifiable metrics to base decisions on. Metrics such as operation speed and total execution time assign real numerical values to the performance of a function. By adding benchmarks, developers can use these metrics to decide how to improve their code.

The Go programming language has a benchmarking utility in its standard library package testing. To benchmark code in Go, define a function whose name begins with Benchmark (followed by a capitalized segment of text) and that accepts a single argument of type *testing.B. The B struct provides methods and fields for determining the number of iterations to run, running benchmarks in parallel, timing execution, etc.
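As a minimal sketch of the convention described above (not the article's own example), the program below benchmarks a hypothetical sortedCopy helper that returns a sorted copy of a list of numbers. The helper, its name, and the sample input are assumptions made for illustration. A benchmark like this would normally live in a file ending in _test.go and be run with go test -bench=. ; here it is invoked through testing.Benchmark from main so that the snippet is self-contained.

```go
package main

import (
	"fmt"
	"sort"
	"testing"
)

// sortedCopy is a hypothetical function under test: it returns a sorted
// copy of nums and leaves the caller's slice untouched.
func sortedCopy(nums []int) []int {
	out := make([]int, len(nums))
	copy(out, nums)
	sort.Ints(out)
	return out
}

// BenchmarkSortedCopy follows the naming convention: the prefix Benchmark,
// a capitalized suffix, and a single *testing.B argument.
func BenchmarkSortedCopy(b *testing.B) {
	input := []int{5, 2, 9, 1, 7, 3, 8, 6, 4, 0} // illustrative input

	b.ResetTimer() // exclude the setup above from the measurement
	for i := 0; i < b.N; i++ {
		// b.N is chosen by the testing package so the loop runs long
		// enough to produce a stable timing.
		sortedCopy(input)
	}
}

func main() {
	// testing.Benchmark runs a single benchmark function outside of
	// "go test" and returns its result (iterations, ns per operation).
	result := testing.Benchmark(BenchmarkSortedCopy)
	fmt.Println(result)
}
```

The reported numbers depend on the machine running the code, which is the point made above: unlike Big-O analysis, a benchmark yields concrete timings for a specific piece of hardware.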
