from transformers import pipeline
Hugging Face Pipelines
Natural Language Processing (NLP)
- Text classification: rating reviews, detecting spam email, judging whether a sentence is grammatically correct, deciding whether two sentences are logically related
- Classifying words within a sentence: identifying parts of speech (nouns, verbs, adjectives) or named entities (people, places, organizations)
- Generating text content: completing an input text with automatically generated text, filling in blanks in a text
- Extracting information from text: given a question and a context, extracting the answer to the question from the information in the context
- Transforming text: translating a text into another language, summarizing a text
Hugging Face (https://huggingface.co/) pipelines let you perform many kinds of NLP tasks.
- sentiment-analysis
- zero-shot-classification
- text-generation
- fill-mask
- ner (named entity recognition)
- question-answering
- summarization
- translation
Basic usage is simple: pass one of the task strings above to the task argument of pipeline, then call the resulting instance with your input string.
Sentiment analysis
Returns whether the given text is POSITIVE or NEGATIVE.
classifier = pipeline("sentiment-analysis")
classifier("We are very happy to show you the 🤗 Transformers library.")
No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
[{'label': 'POSITIVE', 'score': 0.9997795224189758}]
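A pipeline instance also accepts a list of strings and returns one result per input. A small extra example (not part of the original run; output omitted):
classifier([
    "We are very happy to show you the 🤗 Transformers library.",
    "We hope you don't hate it.",
])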
Zero-shot classification
Classifies the given text without being shown any examples. Pass the list of labels you want to classify into via the candidate_labels argument.
classifier2 = pipeline("zero-shot-classification")
classifier2(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)
No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
{'sequence': 'This is a course about the Transformers library',
'labels': ['education', 'business', 'politics'],
'scores': [0.8445950150489807, 0.11197729408740997, 0.0434277318418026]}
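By default the candidate labels are treated as mutually exclusive. If several labels can apply at once, the pipeline also accepts a multi_label flag; a minimal sketch (not part of the original run, output omitted):
classifier2(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
    multi_label=True,  # score each label independently instead of normalizing across labels
)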
Text generation
Writes a continuation of the given text.
generator = pipeline("text-generation")
generator("In this course, we will teach you how to")
No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
[{'generated_text': 'In this course, we will teach you how to run a database with Nginx and PHP. We first take a look at how to run PHP and Nginx together. Then we will use an example MySQL database to create a database. In the same'}]
You can also specify which model to use via the model argument of pipeline; choose a suitable one from https://huggingface.co/models. In addition, max_length sets the maximum number of tokens and num_return_sequences sets how many sequences to generate.
generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
[{'generated_text': 'In this course, we will teach you how to make mistakes as well as avoid them all because they cost you money, and why it makes good money'},
{'generated_text': 'In this course, we will teach you how to understand the best, most effective and most effective ways to perform the work of the American people. These'}]
Fill-mask
Fills in the <mask> part of the given text with a word to complete the sentence. The top_k argument specifies how many candidate completions to return.
unmasker = pipeline("fill-mask")
No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
unmasker("This course will teach you all about <mask> models.", top_k=2)
[{'score': 0.1961977630853653,
'token': 30412,
'token_str': ' mathematical',
'sequence': 'This course will teach you all about mathematical models.'},
{'score': 0.04052729532122612,
'token': 38163,
'token_str': ' computational',
'sequence': 'This course will teach you all about computational models.'}]
Named entity recognition
Named entity recognition (ner) is the task of extracting entities such as persons (PER), locations (LOC), and organizations (ORG) from a text.
Setting the grouped_entities argument to True groups the tokens belonging to the same entity in the output.
ner = pipeline("ner", grouped_entities=True)
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")
No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
[{'entity_group': 'PER',
'score': 0.9981694,
'word': 'Sylvain',
'start': 11,
'end': 18},
{'entity_group': 'ORG',
'score': 0.9796021,
'word': 'Hugging Face',
'start': 33,
'end': 45},
{'entity_group': 'LOC',
'score': 0.9932106,
'word': 'Brooklyn',
'start': 49,
'end': 57}]
Question answering
Given a question via the question argument and a passage via the context argument, this returns the answer to the question together with its start and end positions in the context.
question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)
No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
{'score': 0.6949763894081116, 'start': 33, 'end': 45, 'answer': 'Hugging Face'}
Summarization
Returns a summary of the text.
summarizer = pipeline("summarization")
summarizer("""
America has changed dramatically during recent years. Not only has the number of
graduates in traditional engineering disciplines such as mechanical, civil,
electrical, chemical, and aeronautical engineering declined, but in most of
the premier American universities engineering curricula now concentrate on
and encourage largely the study of engineering science. As a result, there
are declining offerings in engineering subjects dealing with infrastructure,
the environment, and related issues, and greater concentration on high
technology subjects, largely supporting increasingly complex scientific
developments. While the latter is important, it should not be at the expense
of more traditional engineering.
Rapidly developing economies such as China and India, as well as other
industrial countries in Europe and Asia, continue to encourage and advance
the teaching of engineering. Both China and India, respectively, graduate
six and eight times as many traditional engineers as does the United States.
Other industrial countries at minimum maintain their output, while America
suffers an increasingly serious decline in the number of engineering graduates
and a lack of well-educated engineers.
"""
)
No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
[{'summary_text': ' America has changed dramatically during recent years . The number of engineering graduates in the U.S. has declined in traditional engineering disciplines such as mechanical, civil, electrical, chemical, and aeronautical engineering . Rapidly developing economies such as China and India continue to encourage and advance the teaching of engineering .'}]
Translation
Returns the translated text. The language pair can be specified in the task string: the example below uses translation_en_to_fr for English-to-French (for translation into German, use translation_en_to_de as the task argument; a sketch follows the French example). A specific translation model can also be passed via the model argument of pipeline, chosen from https://huggingface.co/models.
translator = pipeline("translation_en_to_fr")
translator("This course is produced by Hugging Face.")
No model was supplied, defaulted to t5-base and revision 686f1db (https://huggingface.co/t5-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
[{'translation_text': 'Ce cours est produit par Hugging Face.'}]
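As noted above, the same pattern works for other language pairs. A minimal sketch for English-to-German (translator_de is a name introduced here; this cell is not part of the original run):
translator_de = pipeline("translation_en_to_de")
translator_de("This course is produced by Hugging Face.")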
To run translation with a specified model on Google Colab, you need to install sentencepiece by running the following, and then restart the kernel.
!pip install sentencepiece
The following code uses Helsinki-NLP models to translate between various languages. As an example, we show translation from English to Japanese.
def create_translation_pipeline(source_lang, target_lang):
    model_name = f'Helsinki-NLP/opus-mt-{source_lang}-{target_lang}'
    translator = pipeline("translation", model=model_name)
    return translator

def translate_text(translator, text):
    result = translator(text, max_length=500)
    return result[0]['translation_text']

# Example usage:
source_lang_code = "en"   # English
target_lang_code = "jap"  # Japanese

translator = create_translation_pipeline(source_lang_code, target_lang_code)

english_text = "This is a pen."
translated_text = translate_text(translator, english_text)
print(f"{source_lang_code.capitalize()}: {english_text}")
print(f"{target_lang_code.capitalize()}: {translated_text}")
En: This is a pen.
Jap: これ は 筆 で あ る .
How it works
Internally, pipeline breaks down into the following steps:
string => tokenizer => model => post-processing
from transformers import AutoTokenizer
from transformers import AutoModel
from pprint import pprint
from transformers import AutoModelForSequenceClassification
import torch
Tokenizer
First, the input string must be split into tokens (words, symbols, etc.), and each token must be replaced with an integer. For this, use the from_pretrained method of the AutoTokenizer class, passing as the argument a model name (checkpoint) from https://huggingface.co/models.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
pprint(tokenizer)
DistilBertTokenizerFast(name_or_path='distilbert-base-uncased-finetuned-sst-2-english', vocab_size=30522, model_max_length=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}, clean_up_tokenization_spaces=True)
Giving a string (or a list of strings) to the created tokenizer produces a dictionary containing the converted numerical data. Its keys are attention_mask, which indicates which tokens to attend to, and input_ids, a multidimensional array of the input converted to numbers.
You must also specify return_tensors, which indicates which deep learning framework to use; since we use PyTorch here, we pass "pt".
raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
pprint(inputs)
{'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]]),
'input_ids': tensor([[ 101, 1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172,
2607, 2026, 2878, 2166, 1012, 102],
[ 101, 1045, 5223, 2023, 2061, 2172, 999, 102, 0, 0,
0, 0, 0, 0, 0, 0]])}
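To check which text the integer IDs correspond to, they can be converted back with the tokenizer. A minimal sketch (not part of the original run):
# Map the IDs of the first sentence back to tokens and to a decoded string.
ids = inputs["input_ids"][0].tolist()
print(tokenizer.convert_ids_to_tokens(ids))
print(tokenizer.decode(ids))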
Model
Next, we create an instance of a model class, here using the from_pretrained method of the AutoModel class.
The model created this way contains only the base Transformer; its output is a multidimensional array (tensor) of features extracted from the input.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModel.from_pretrained(checkpoint)
Some weights of the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing DistilBertModel: ['classifier.bias', 'pre_classifier.bias', 'pre_classifier.weight', 'classifier.weight']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
pprint(model)
DistilBertModel(
(embeddings): Embeddings(
(word_embeddings): Embedding(30522, 768, padding_idx=0)
(position_embeddings): Embedding(512, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(transformer): Transformer(
(layer): ModuleList(
(0-5): 6 x TransformerBlock(
(attention): MultiHeadSelfAttention(
(dropout): Dropout(p=0.1, inplace=False)
(q_lin): Linear(in_features=768, out_features=768, bias=True)
(k_lin): Linear(in_features=768, out_features=768, bias=True)
(v_lin): Linear(in_features=768, out_features=768, bias=True)
(out_lin): Linear(in_features=768, out_features=768, bias=True)
)
(sa_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(ffn): FFN(
(dropout): Dropout(p=0.1, inplace=False)
(lin1): Linear(in_features=768, out_features=3072, bias=True)
(lin2): Linear(in_features=3072, out_features=768, bias=True)
(activation): GELUActivation()
)
(output_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
)
)
)
)
Unpacking the dictionary produced by the tokenizer and feeding it to the model, we can confirm that the output is a PyTorch tensor.
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
torch.Size([2, 16, 768])
outputs
BaseModelOutput(last_hidden_state=tensor([[[-0.1798, 0.2333, 0.6321, ..., -0.3017, 0.5008, 0.1481],
[ 0.2758, 0.6497, 0.3200, ..., -0.0760, 0.5136, 0.1329],
[ 0.9046, 0.0985, 0.2950, ..., 0.3352, -0.1407, -0.6464],
...,
[ 0.1466, 0.5661, 0.3235, ..., -0.3376, 0.5100, -0.0561],
[ 0.7500, 0.0487, 0.1738, ..., 0.4684, 0.0030, -0.6084],
[ 0.0519, 0.3729, 0.5223, ..., 0.3584, 0.6500, -0.3883]],
[[-0.2937, 0.7283, -0.1497, ..., -0.1187, -1.0227, -0.0422],
[-0.2206, 0.9384, -0.0951, ..., -0.3643, -0.6605, 0.2407],
[-0.1536, 0.8988, -0.0728, ..., -0.2189, -0.8528, 0.0710],
...,
[-0.3017, 0.9002, -0.0200, ..., -0.1082, -0.8412, -0.0861],
[-0.3338, 0.9674, -0.0729, ..., -0.1952, -0.8181, -0.0634],
[-0.3454, 0.8824, -0.0426, ..., -0.0993, -0.8329, -0.1065]]],
grad_fn=<NativeLayerNormBackward0>), hidden_states=None, attentions=None)
This time we create a model that includes the layers needed to actually perform sentiment analysis, using the AutoModelForSequenceClassification class.
The resulting values are stored in the logits tensor of the output.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model2 = AutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs2 = model2(**inputs)
outputs2
SequenceClassifierOutput(loss=None, logits=tensor([[-1.5607, 1.6123],
[ 4.1692, -3.3464]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)
Post-processing
The resulting tensor is converted into probabilities with the softmax function; these are the prediction values.
predictions = torch.nn.functional.softmax(outputs2.logits, dim=-1)
print(predictions)
tensor([[4.0195e-02, 9.5981e-01],
[9.9946e-01, 5.4418e-04]], grad_fn=<SoftmaxBackward0>)
The prediction for the first sentence is [0.0402, 0.9598] and for the second sentence [0.9995, 0.0005]. This means the first sentence most likely belongs to class 1 and the second to class 0.
To find out which labels the model uses, look at the id2label attribute of the model's config.
model2.config.id2label
{0: 'NEGATIVE', 1: 'POSITIVE'}
Therefore, the first sentence is judged to be POSITIVE and the second to be NEGATIVE.
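The last two steps can also be combined programmatically. A minimal sketch (the loop and variable names are introduced here, not taken from the original):
# Pick the most probable class for each sentence and map it to its label.
for probs in predictions:
    label_id = int(torch.argmax(probs))
    print(model2.config.id2label[label_id], float(probs[label_id]))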
Computer vision
Image classification
We use the following image as an example.
image_example1 = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
vision_classifier = pipeline(task="image-classification")

preds = vision_classifier(images=image_example1)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
preds
No model was supplied, defaulted to google/vit-base-patch16-224 and revision 5dca96d (https://huggingface.co/google/vit-base-patch16-224).
Using a pipeline without specifying a model name and revision in production is not recommended.
[{'score': 0.4335, 'label': 'lynx, catamount'},
{'score': 0.0348,
'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'},
{'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'},
{'score': 0.0239, 'label': 'Egyptian cat'},
{'score': 0.0229, 'label': 'tiger cat'}]
Object detection
An additional package must be installed by running the following.
!pip install timm
from transformers import pipeline
detector = pipeline(task="object-detection")
preds = detector(image_example1)
preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
preds
No model was supplied, defaulted to facebook/detr-resnet-50 and revision 2729413 (https://huggingface.co/facebook/detr-resnet-50).
Using a pipeline without specifying a model name and revision in production is not recommended.
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.
The `max_size` parameter is deprecated and will be removed in v4.26. Please specify in `size['longest_edge'] instead`.
[{'score': 0.9864,
'label': 'cat',
'box': {'xmin': 178, 'ymin': 154, 'xmax': 882, 'ymax': 598}}]
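As a usage example (not part of the original notebook), the detected boxes can be drawn on the image with Pillow; requests and PIL are assumed to be available.
import requests
from io import BytesIO
from PIL import Image, ImageDraw

# Download the example image and draw each detected box and label on it.
image = Image.open(BytesIO(requests.get(image_example1).content))
draw = ImageDraw.Draw(image)
for pred in preds:
    box = pred["box"]
    draw.rectangle((box["xmin"], box["ymin"], box["xmax"], box["ymax"]), outline="red", width=3)
    draw.text((box["xmin"], box["ymin"]), pred["label"], fill="red")
image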
Image segmentation
segmenter = pipeline(task="image-segmentation")
preds = segmenter(image_example1)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")
No model was supplied, defaulted to facebook/detr-resnet-50-panoptic and revision fc15262 (https://huggingface.co/facebook/detr-resnet-50-panoptic).
Using a pipeline without specifying a model name and revision in production is not recommended.
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.
`label_ids_to_fuse` unset. No instance will be fused.
{'score': 0.9879, 'label': 'LABEL_184'}
{'score': 0.9973, 'label': 'snow'}
{'score': 0.9972, 'label': 'cat'}
Depth estimation
estimator = pipeline(task="depth-estimation", model="Intel/dpt-large")
result = estimator(images=image_example1)
result
Some weights of DPTForDepthEstimation were not initialized from the model checkpoint at Intel/dpt-large and are newly initialized: ['neck.fusion_stage.layers.0.residual_layer1.convolution1.weight', 'neck.fusion_stage.layers.0.residual_layer1.convolution2.weight', 'neck.fusion_stage.layers.0.residual_layer1.convolution1.bias', 'neck.fusion_stage.layers.0.residual_layer1.convolution2.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.
{'predicted_depth': tensor([[[ 0.7999, 0.8382, 0.8483, ..., 2.3091, 2.3669, 2.3291],
[ 0.8054, 0.8101, 0.8106, ..., 2.3390, 2.3357, 2.3307],
[ 0.8580, 0.8359, 0.8457, ..., 2.3557, 2.3509, 2.3599],
...,
[26.3410, 26.4059, 26.3881, ..., 17.5088, 17.4768, 17.4148],
[26.4727, 26.4515, 26.5042, ..., 17.4223, 17.3911, 17.4052],
[26.5116, 26.5452, 26.5301, ..., 17.4719, 17.4700, 17.4025]]]),
'depth': <PIL.Image.Image image mode=L size=960x686>}
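The 'depth' entry is a PIL image, so it can be displayed in the notebook or saved directly. A small usage note (not part of the original run):
depth_map = result["depth"]   # a PIL.Image (mode "L")
depth_map.save("depth.png")   # save it, or evaluate depth_map in a cell to display it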
Audio
We use the following audio file of a speech.
audio_example1 = "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"
Audio classification
Using the model superb/hubert-base-superb-er classifies the emotion of the speech, while MIT/ast-finetuned-audioset-10-10-0.4593 classifies the type of sound.
# classifier = pipeline(task="audio-classification", model="superb/hubert-base-superb-er")
classifier = pipeline(task="audio-classification", model="MIT/ast-finetuned-audioset-10-10-0.4593")
preds = classifier(audio_example1)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
preds
/usr/local/lib/python3.10/dist-packages/transformers/models/audio_spectrogram_transformer/feature_extraction_audio_spectrogram_transformer.py:96: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:206.)
waveform = torch.from_numpy(waveform).unsqueeze(0)
[{'score': 0.4208, 'label': 'Speech'},
{'score': 0.1793, 'label': 'Rain on surface'},
{'score': 0.1301, 'label': 'Rain'},
{'score': 0.096, 'label': 'Raindrop'},
{'score': 0.0578, 'label': 'Music'}]
Speech recognition
transcriber = pipeline(task="automatic-speech-recognition", model="openai/whisper-small")
transcriber(audio_example1)
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}