Skip to content

Instantly share code, notes, and snippets.

View zbjornson's full-sized avatar

Zach Bjornson zbjornson

View GitHub Profile
#include <stdint.h>
typedef unsigned char U8;
typedef unsigned short U16;
typedef unsigned int U32;
typedef unsigned long long U64;
typedef intptr_t SINTa;
struct KernelState
{
@rygorous
rygorous / gist:f5e11f6589088f42dddd2711582afb1e
Last active June 19, 2021 21:06
Single-precision FMA using double-precision mul/add ops
We agreed beforehand that we don't care about tininess-before-or-after-rounding shenanigans. :)
With that out of the way, the idea is to do something like this:
fma(af, bf, cf):
# All arithmetic done with RN
ad = float_to_double(af)
bd = float_to_double(bf)
cd = float_to_double(cf)
@nadavrot
nadavrot / Matrix.md
Last active April 26, 2024 08:28
Efficient matrix multiplication

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).

Intro

Matrix multiplication is a mathematical operation that defines the product of

@shafik
shafik / WhatIsStrictAliasingAndWhyDoWeCare.md
Last active April 24, 2024 03:49
What is Strict Aliasing and Why do we Care?

What is the Strict Aliasing Rule and Why do we care?

(OR Type Punning, Undefined Behavior and Alignment, Oh My!)

What is strict aliasing? First we will describe what is aliasing and then we can learn what being strict about it means.

In C and C++ aliasing has to do with what expression types we are allowed to access stored values through. In both C and C++ the standard specifies which expression types are allowed to alias which types. The compiler and optimizer are allowed to assume we follow the aliasing rules strictly, hence the term strict aliasing rule. If we attempt to access a value using a type not allowed it is classified as undefined behavior(UB). Once we have undefined behavior all bets are off, the results of our program are no longer reliable.

Unfortunately with strict aliasing violations, we will often obtain the results we expect, leaving the possibility the a future version of a compiler with a new optimization will break code we th

@giwa
giwa / file0.txt
Last active December 6, 2023 07:36
Install g++/gcc 4.8.2 in CentOS 6.6 ref: http://qiita.com/giwa/items/28c754d8fc2936c0f6d2
$ wget http://people.centos.org/tru/devtools-2/devtools-2.repo -O /etc/yum.repos.d/devtools-2.repo
$ yum install devtoolset-2-gcc devtoolset-2-binutils
$ yum install devtoolset-2-gcc-c++ devtoolset-2-gcc-gfortran