
10/24/2023

hugging face onnx export and model quantisation method

refer to the example code:


.


from functools import partial
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTQuantizer, ORTModelForSequenceClassification
from optimum.onnxruntime.configuration import AutoQuantizationConfig, AutoCalibrationConfig

model_id = "distilbert-base-uncased-finetuned-sst-2-english"

onnx_model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
quantizer = ORTQuantizer.from_pretrained(onnx_model)
qconfig = AutoQuantizationConfig.arm64(is_static=True, per_channel=False)

def preprocess_fn(ex, tokenizer):
    return tokenizer(ex["sentence"])

calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    preprocess_function=partial(preprocess_fn, tokenizer=tokenizer),
    num_samples=50,
    dataset_split="train",
)

calibration_config = AutoCalibrationConfig.minmax(calibration_dataset)

ranges = quantizer.fit(
    dataset=calibration_dataset,
    calibration_config=calibration_config,
    operators_to_quantize=qconfig.operators_to_quantize,
)

model_quantized_path = quantizer.quantize(
    save_dir="path/to/output/model",
    calibration_tensors_range=ranges,
    quantization_config=qconfig,
)

..
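Once the quantisation finishes, the quantised model can be loaded back for inference. A minimal sketch, assuming the default output file name (model_quantized.onnx) that ORTQuantizer writes into save_dir:

.

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"

# file_name assumes the default "_quantized" suffix used by ORTQuantizer
quantized_model = ORTModelForSequenceClassification.from_pretrained(
    "path/to/output/model", file_name="model_quantized.onnx"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# quick sanity check through a pipeline
classifier = pipeline("text-classification", model=quantized_model, tokenizer=tokenizer)
print(classifier("I love this movie!"))

..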

options for several instruction sets:

.



optimum-cli onnxruntime quantize --help
usage: optimum-cli <command> [<args>] onnxruntime quantize [-h] --onnx_model ONNX_MODEL -o OUTPUT [--per_channel] (--arm64 | --avx2 | --avx512 | --avx512_vnni | --tensorrt | -c CONFIG)

options:
  -h, --help            show this help message and exit
  --arm64               Quantization for the ARM64 architecture.
  --avx2                Quantization with AVX-2 instructions.
  --avx512              Quantization with AVX-512 instructions.
  --avx512_vnni         Quantization with AVX-512 and VNNI instructions.
  --tensorrt            Quantization for NVIDIA TensorRT optimizer.
  -c CONFIG, --config CONFIG
                        `ORTConfig` file to use to optimize the model.

Required arguments:
  --onnx_model ONNX_MODEL
                        Path to the repository where the ONNX models to quantize are located.
  -o OUTPUT, --output OUTPUT
                        Path to the directory where to store generated ONNX model.

Optional arguments:
  --per_channel         Compute the quantization parameters on a per-channel basis.

..
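For example, assuming the exported ONNX model sits in ./onnx_model_dir, an AVX-512 VNNI quantisation could be run as:

optimum-cli onnxruntime quantize --onnx_model ./onnx_model_dir --avx512_vnni -o ./quantized_model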


refer to this page for details:

https://huggingface.co/docs/optimum/onnxruntime/usage_guides/quantization#quantize-seq2seq-models



refer to this code as well; you may be able to get an idea from it.

.


# Export to ONNX
from optimum.onnxruntime import ORTModelForSeq2SeqLM, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# export=True converts the checkpoint to ONNX on the fly
# (from_transformers is the deprecated name for the same flag)
model = ORTModelForSeq2SeqLM.from_pretrained(model_path, export=True, provider='CUDAExecutionProvider').to(device)
model.save_pretrained(onnx_path)

# quantization code
encoder_quantizer = ORTQuantizer.from_pretrained(onnx_path, file_name='encoder_model.onnx')
decoder_quantizer = ORTQuantizer.from_pretrained(onnx_path, file_name='decoder_model.onnx')
decoder_wp_quantizer = ORTQuantizer.from_pretrained(onnx_path, file_name='decoder_with_past_model.onnx')
quantizer = [encoder_quantizer, decoder_quantizer, decoder_wp_quantizer]
dqconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
for q in quantizer:
    q.quantize(save_dir=output_path, quantization_config=dqconfig)


#inference code
model = ORTModelForSeq2SeqLM.from_pretrained(
    model_id=model_path,
    encoder_file_name='encoder_model_quantized.onnx',
    decoder_file_name='decoder_model_quantized.onnx',
    decoder_with_past_file_name='decoder_with_past_model_quantized.onnx',
    provider='CUDAExecutionProvider',
    use_io_binding=True,
).to(self.device)
tokenizer = AutoTokenizer.from_pretrained('google/flan-t5-large')

...

dataset = self.dataset(input_dict)
dataset.set_format(type='torch', device=self.device, columns=['input_ids', 'attention_mask'])
data_loader = DataLoader(dataset, batch_size=self.batch_size, collate_fn=self.data_collator)
generated_outputs: List[OUTPUT_TYPE] = []
for i, batch in enumerate(data_loader):
    _batch = {key: val.to(self.device) for key, val in batch.items()}
    outputs = self.model.generate(**_batch, generation_config=self.generation_config)
    decoded_outputs = self.tokenizer.batch_decode(outputs.cpu().tolist(), skip_special_tokens=True)

..
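Note that this seq2seq example uses dynamic quantisation (is_static=False), so no calibration dataset is needed, unlike the static example above. Also, generation_config in the inference loop comes from the surrounding class; a minimal sketch of building one (the values below are only illustrative assumptions):

.

from transformers import GenerationConfig

# illustrative values only; tune for your own task
generation_config = GenerationConfig(
    max_new_tokens=128,
    num_beams=1,
    do_sample=False,
)

..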

Thank you.

note! quantisation and optimisation are different things.
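Graph optimisation is handled by ORTOptimizer, a separate step from ORTQuantizer. A minimal sketch, assuming the same sequence-classification model as the first example:

.

from optimum.onnxruntime import ORTOptimizer, ORTModelForSequenceClassification
from optimum.onnxruntime.configuration import OptimizationConfig

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
onnx_model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

# graph optimisation (operator fusion, constant folding, ...) - not quantisation
optimizer = ORTOptimizer.from_pretrained(onnx_model)
optimization_config = OptimizationConfig(optimization_level=2)
optimizer.optimize(save_dir="path/to/optimized/model", optimization_config=optimization_config)

..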


10/23/2023

comparing a custom_vit_image_processor vs the vit_image_processor of transformers

Check that the custom image processing matches the original internal processing function in transformers.

.

pixel_values1 = self.feature_extractor(images=image, return_tensors="pt").pixel_values

# Convert numpy array to PyTorch tensor
pixel_values2 = self.custom_vit_image_processor(image)
pixel_values2 = torch.tensor(pixel_values2, dtype=torch.float32).unsqueeze(0) # Add batch dimension and ensure float32 type

# 1. Shape Check
assert pixel_values1.shape == pixel_values2.shape, "The tensors have different shapes"
# 2. Absolute Difference
diff = torch.abs(pixel_values1 - pixel_values2)

# 3. Summarize Discrepancies
mean_diff = torch.mean(diff).item()
max_diff = torch.max(diff).item()
min_diff = torch.min(diff).item()
print(f"Mean Absolute Difference: {mean_diff}")
print(f"Maximum Absolute Difference: {max_diff}")
print(f"Minimum Absolute Difference: {min_diff}")


# Additionally, if you want to see where the maximum difference occurs:
max_diff_position = torch.where(diff == max_diff)
print(f"Position of Maximum Difference: {max_diff_position}")

..
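For reference, a minimal sketch of what such a custom processor might look like, assuming the stock ViT preprocessing (resize to 224x224, rescale by 1/255, normalise with mean/std of 0.5, HWC to CHW); the exact values depend on the checkpoint's preprocessor config:

.

import numpy as np
from PIL import Image

def custom_vit_image_processor(image: Image.Image) -> np.ndarray:
    # hypothetical re-implementation, assuming the default ViT settings
    image = image.convert("RGB").resize((224, 224), Image.BILINEAR)
    arr = np.asarray(image).astype(np.float32) / 255.0   # rescale to [0, 1]
    arr = (arr - 0.5) / 0.5                               # normalise with mean/std 0.5
    return arr.transpose(2, 0, 1)                         # HWC -> CHW

# a tolerance-based check is often handier than eyeballing the statistics above
# (tiny interpolation differences are possible, hence the atol):
# assert torch.allclose(pixel_values1, pixel_values2, atol=1e-4)

..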


Thank you.

Hope it is helpful.


5/18/2023

swin transformer v2 - model forward and onnx export


1. load pre-trained model

2. export onnx

3. load onnx


refer to code:


.

import warnings
from torch.jit import TracerWarning
warnings.filterwarnings("ignore", category=TracerWarning)

#------------------
#swin-transformer v2 pretrained model
#------------------

from transformers import AutoImageProcessor, Swinv2Model
import torch
from datasets import load_dataset

dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

image_processor = AutoImageProcessor.from_pretrained("microsoft/swinv2-tiny-patch4-window8-256")
model = Swinv2Model.from_pretrained("microsoft/swinv2-tiny-patch4-window8-256")

inputs = image_processor(image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

last_hidden_states = outputs.last_hidden_state

# print( list(last_hidden_states.shape) )
# Convert last_hidden_states to numpy
last_hidden_states_numpy = last_hidden_states.detach().numpy()
print(f"Shape of last_hidden_states: {last_hidden_states_numpy.shape}")
print(last_hidden_states)



#----------------
#onnx export
#------------------
import torch

# ensure the model is in evaluation mode
model.eval()

# create a dummy input with the same size as your input
# for this example, let's assume the input is of size [1, 3, 256, 256]
# (torch.autograd.Variable is no longer needed; a plain tensor works)
dummy_input = torch.randn(1, 3, 256, 256)

# specify the file path
file_path = "./swinv2_tiny.onnx"

# export the model
torch.onnx.export(model, dummy_input, file_path)

#------------------
#onnx inference
#------------------
import onnxruntime as ort

# load the ONNX model
ort_session = ort.InferenceSession(file_path)

# convert the PyTorch tensor to numpy array for onnxruntime
print(inputs.keys())
inputs_numpy = inputs["pixel_values"].numpy()
# inputs_numpy = inputs["input_ids"].numpy()

# create a dictionary from model input name to the actual input data
ort_inputs = {ort_session.get_inputs()[0].name: inputs_numpy}

# forward
ort_outs = ort_session.run(None, ort_inputs)
print(f"Shape of ort_outs: {ort_outs[0].shape}")
print(ort_outs)
# print(type(ort_outs))
# print( list(ort_outs.shape) )

..
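The export above bakes in the dummy input's batch size. Continuing with the variables from the script, here is a sketch of the same export with named inputs and a dynamic batch axis, plus a numeric check of the ONNX output against the PyTorch output (the input/output names and opset are my own choices, not required by the model):

.

import numpy as np

# export again with explicit names and a dynamic batch dimension
torch.onnx.export(
    model,
    dummy_input,
    "./swinv2_tiny_dynamic.onnx",
    input_names=["pixel_values"],
    output_names=["last_hidden_state"],
    dynamic_axes={"pixel_values": {0: "batch"}, "last_hidden_state": {0: "batch"}},
    opset_version=14,
)

# compare the ONNX Runtime output against the PyTorch output computed earlier
np.testing.assert_allclose(last_hidden_states_numpy, ort_outs[0], rtol=1e-3, atol=1e-4)
print("PyTorch and ONNX Runtime outputs match within tolerance")

..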


Thank you.

www.marearts.com

🙇🏻‍♂️

5/13/2022

tokens to word, transformer

 

Refer to the code to figure out how tokens are composed for a word.

The code shows you the token list for each word.


..

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

example = "This is a tokenization example"

print('input sentence: ', example)
print('---')
print('tokens :')
print( tokenizer.encode(example, add_special_tokens=False, return_attention_mask=False, return_token_type_ids=False) )
print('---')
print('word and tokens :')
print({x : tokenizer.encode(x, add_special_tokens=False, return_attention_mask=False, return_token_type_ids=False) for x in example.split()})
print('---')
idx = 1
enc =[tokenizer.encode(x, add_special_tokens=False, return_attention_mask=False, return_token_type_ids=False) for x in example.split()]
desired_output = []
for token in enc:
    tokenoutput = []
    for ids in token:
        tokenoutput.append(idx)
        idx += 1
    desired_output.append(tokenoutput)

print('tokens in grouped list')
print(desired_output)
print('---')

..


input sentence:  This is a tokenization example
---
tokens :
[713, 16, 10, 19233, 1938, 1246]
---
word and tokens :
{'This': [713], 'is': [354], 'a': [102], 'tokenization': [46657, 1938], 'example': [46781]}
---
tokens in grouped list
[[1], [2], [3], [4, 5], [6]]
---
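Note that encoding each word on its own can give different ids than encoding the whole sentence, because RoBERTa's BPE treats a leading space as part of the token. With a fast tokenizer you can also get the token-to-word mapping directly; a minimal sketch:

.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
example = "This is a tokenization example"

enc = tokenizer(example, add_special_tokens=False)
print(enc.tokens())    # sub-word tokens
print(enc.word_ids())  # word index per token, e.g. [0, 1, 2, 3, 3, 4]

..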


Thank you.
www.marearts.com

8/11/2021

ERROR: Could not build wheels for tokenizers which use PEP 517 and cannot be installed directly

I met this error when installing simpletransformers or transformers.

I tried many things with no luck.

But this was the solution for me:

Install Rust before installing transformers.

so..

curl https://sh.rustup.rs -sSf | bash -s -- -y
export PATH="/root/.cargo/bin:${PATH}"

and install transformers
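That last step is the usual pip install transformers (or pip install simpletransformers), run in the same shell where Rust was just installed so that cargo is on the PATH.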

Thank you.

www.marearts.com