Motoki Wu tokestermw

import torch
import torch.nn as nn
import torch.nn.functional as F
# helpers
def make_unit_length(x, epsilon=1e-6):
    norm = x.norm(p=2, dim=-1, keepdim=True)
    return x.div(norm + epsilon)
import random
def augmentation_fun(x, augment_by=3):
    # return `augment_by` noisy copies of the data point,
    # each shifted by uniform noise drawn from [-1, 1)
    return [x + random.random() * 2 - 1 for _ in range(augment_by)]
def train_loop(dataset, do_augment=False):
    # emit one data point at a time
tokestermw / play_elmo_embeddings_softmax.py
Last active September 6, 2018 21:27
Test code of `_ElmoSoftmax`.
"""
To use it inside ELMo script
To get the embeddings:
allennlp elmo sample_sents.txt out1.hdf5 --top
python -c "import h5py; f = h5py.File('out1.hdf5'); print(f['0'][:], f['0'].shape)"
To get probabilities:

Keybase proof

I hereby claim:

  • I am tokestermw on github.
  • I am motoki (https://keybase.io/motoki) on keybase.
  • I have a public key whose fingerprint is 26C6 F8AB C16D 50E4 3A97 05C2 B235 7159 51D6 074D

To claim this, I am signing this object:

Where A is a class (e.g. definite article), and B is another class (e.g. indefinite article). O is the null class.

The cat had a dog .
 A   O   O  B  O  O
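
The labeling above can be sketched in plain Python as follows. This is only an illustration of the token-to-class alignment; the variable names (`tokens`, `labels`, `tagged`) are hypothetical and do not come from the original gist.

```python
# Sentence and per-token classes from the example above.
# "A" and "B" are the two classes of interest; "O" is the null class.
tokens = "The cat had a dog .".split()
labels = "A O O B O O".split()

# Every token needs exactly one label.
if len(tokens) != len(labels):
    raise ValueError("tokens and labels must have the same length")

# Pair each token with its class.
pairs = list(zip(tokens, labels))

# Keep only the tokens that carry a non-null class.
tagged = [(token, label) for token, label in pairs if label != "O"]

print(pairs)
print(tagged)
```

Only "The" (class A) and "a" (class B) survive the filter; every other token belongs to the null class.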

v1

| Real | Predicted | Verdict

tokestermw / tf_percent_confusion_metric.py
Created May 2, 2018 21:15
Calculate a percentage from the confusion matrix in TensorFlow.
import tensorflow as tf
from tensorflow.python.ops.metrics_impl import _streaming_confusion_matrix
# almost the same as
def confusion_matrix(labels, predictions, num_classes, weights=None):
    total_cm, update_op = _streaming_confusion_matrix(
        labels, predictions, num_classes, weights=weights)
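
The preview above is cut off before the percentage is computed, so the exact metric in the gist is not known. As a framework-free sketch of the general idea, assuming the percentage of interest is overall accuracy (the diagonal of the confusion matrix divided by the total count), it could look like this. The helper names `confusion_counts` and `percent_correct` are hypothetical and are not part of TensorFlow or the gist.

```python
def confusion_counts(labels, predictions, num_classes):
    """Build a num_classes x num_classes matrix of counts.

    Rows index the true label, columns index the predicted label.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(labels, predictions):
        matrix[true_label][predicted_label] += 1
    return matrix


def percent_correct(matrix):
    """Percentage of examples on the diagonal (assumed metric: accuracy)."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


labels = [0, 1, 2, 1, 0, 0]
predictions = [0, 2, 2, 1, 0, 1]
matrix = confusion_counts(labels, predictions, num_classes=3)
print(matrix)
print(percent_correct(matrix))
```

Other percentages, such as per-class recall, would divide each diagonal entry by its row sum instead of the grand total.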
tokestermw / birnnlm_pytorch.py
Last active May 30, 2020 08:29
Simple example of Bidirectional RNN Language Model in PyTorch. (blog post: https://medium.com/@plusepsilon/the-bidirectional-language-model-1f3961d1fb27)
import torch, torch.nn as nn
from torch.autograd import Variable
text = ['BOS', 'How', 'are', 'you', 'EOS']
seq_len = len(text)
batch_size = 1
embedding_size = 1
hidden_size = 1
output_size = 1
import random
def process_line(line):
    columns = line.split('\t')
    if len(columns) < 6:
        return None
    n_corrections = columns[0]
    serial_number = columns[1]
    url = columns[2]