GIST-Embedding-v0 base trained on 4786 en-tr vacancy pairs

This is a sentence-transformers model fine-tuned from avsolatorio/GIST-Embedding-v0 on 4,786 English-Turkish (en-tr) job vacancy pairs. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: avsolatorio/GIST-Embedding-v0
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: mit

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
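
The last two modules in the stack above can be illustrated with a minimal sketch in plain Python. It assumes a toy 4-dimensional token matrix in place of the real 768-dimensional BertModel output; the helper names `cls_pool` and `l2_normalize` are hypothetical and are not part of the sentence-transformers API.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector, mirroring pooling_mode_cls_token=True."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length, as the Normalize() module does."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # leave an all-zero vector unchanged
    return [x / norm for x in vector]

# Toy token embeddings: 3 tokens x 4 dimensions (the real model uses 768).
toy_tokens = [
    [3.0, 0.0, 4.0, 0.0],  # position 0 plays the role of the [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(toy_tokens))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output has unit length, the cosine similarity of two such embeddings equals their dot product.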

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("avsolatorio/GIST-Embedding-v0_en-tr_jobs")
# Run inference
sentences = [
    'HR-ELEKT Elektrik Teknisyeni Mels Turizm Otelcilik Rest. Eğlence Hizm.San. Tic. İstanbul(Avr.) Ortaköy - Kuruçeşme de faaliyet gösteren işletmemiz için tecrübeli Sıhhi Tesisat ve Elektrik Bakım Teknisyeni arayışımız bulunmaktadır. ELEKTRİK TEKNİSYENİ Endüstri Meslek Lisesi / Meslek Yüksek Okulu Elektrik bölümü mezunu, Elektrik bakım teknisyenliği konusunda min. 3 yıl tecrübeli, Bina bakım onarım üzerine tecrübesi olan, Tercihen mekanik bakım hakkında bilgi sahibi, Askerlikle ilişiği olmayan, Vardiyalı çalışabilecek,\xa0',
    'HR-ELEKT Electrical Technician Mels Tourism Hotel Management Rest. Entertainment Hizm.San. Tic. Istanbul(Avr.) We are looking for an experienced Plumbing and Electrical Maintenance Technician for our business operating in Ortaköy - Kuruçeşme. ELECTRICAL TECHNICIAN Industrial Vocational High School / Vocational High School Electrical Department graduate, Minimum 3 years of experience in electrical maintenance technician, Experience in building maintenance and repair, Preferably knowledgeable about mechanical maintenance, Not related to military service, Able to work in shifts,\xa0',
    'DURU006 PUBLIC RELATIONS ASSISTANT BETA MED DURU POLYCLINIC Hatay ·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Execution of the public relations process,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Keeping customer satisfaction at the highest level,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Supporting cash transactions. public relations customer satisfaction cash register cashier BETA MED DURU POLYCLINIC, one of the leading brands in the field of health and beauty in Iskenderun, is looking for colleagues with the following characteristics;\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 At least a high school graduate,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Compatible with teamwork,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Adopting a customer satisfaction-oriented service approach,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Friendly, dynamic, strong human relations,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Knowing how to use cash registers and computers, ·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Able to adapt to intense pace and flexible working hours,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Advanced communication skills,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Taking care of oneself,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Attaching importance to business ethics and work discipline,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 \xa0Able to take responsibility,·\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Preferably having worked in a similar position before.\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 Residing in Iskenderun.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
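
For semantic search, similarity scores like the ones above are typically used to rank candidates against a query. The following is a rough sketch of that ranking step in plain Python. The 3-dimensional vectors are hypothetical stand-ins for the output of `model.encode(...)`, and `rank_by_similarity` is an illustrative helper, not a library function.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings standing in for encoded vacancy texts.
query_vec = [1.0, 0.0, 0.0]
candidate_vecs = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.8, 0.6, 0.0],   # close to the query
    [-1.0, 0.0, 0.0],  # opposite direction
]

ranking = rank_by_similarity(query_vec, candidate_vecs)
print([index for index, _ in ranking])  # [1, 0, 2]
```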

Evaluation

Metrics

Triplet

  • Datasets: en-tr-jobs-validation and en-tr-jobs-test
  • Evaluated with TripletEvaluator
| Metric | en-tr-jobs-validation | en-tr-jobs-test |
|:--|:--|:--|
| cosine_accuracy | 1.0 | 1.0 |
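
As a rough illustration of what `cosine_accuracy` measures, the sketch below counts a triplet as correct when the anchor is more similar to the positive than to the negative. This card does not say how the negatives for evaluation were chosen, so the vectors here are made up, and `triplet_cosine_accuracy` is a hypothetical helper rather than the evaluator's actual code.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def triplet_cosine_accuracy(triplets):
    """Share of (anchor, positive, negative) triplets ranked correctly."""
    correct = 0
    for anchor, positive, negative in triplets:
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative):
            correct += 1
    return correct / len(triplets)

# Two made-up triplets: the first is ranked correctly, the second is not.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [0.9, 0.1]),
]

accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)  # 0.5
```

A score of 1.0, as reported above, means every evaluated triplet was ranked correctly under this criterion.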

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,786 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    |         | anchor | positive |
    |:--|:--|:--|
    | type    | string | string |
    | details | min: 79 tokens, mean: 388.99 tokens, max: 512 tokens | min: 32 tokens, mean: 256.64 tokens, max: 512 tokens |
  • Samples:
    anchor positive
    TI-OB70635 İşyeri Hekimi Eymen İşgüvenliği Hizmetleri Ltd. Kocaeli, İstanbul(Asya) İşyeri Hekimliği konusunda deneyimli, Tam zamanlı veya yarı zamanlı görevlendirilecek İstanbul'da görevlendirileceği firmalara seyahat engeli olmayan Ofis programlarına hakim, raporlama yapabilecek Sunum ve eğitim yeteneği olan Görevlendirildiği firmalarda uyum içerisinde çalışabilecekİşyeri Hekimleri aranmaktadır... işyeri hekimi Çalışma ve Sosyal Güvenlik Bakanlığı İşyeri Hekimliği sertifikası olan, İnsan ilişkilerinde başarılı, sözlü ve yazılı iletişim becerileri kuvvetli olan,  Tam zamanlı veya yarı zamanlı olarak çalışabilecek, B sınıfı ehliyete sahip ve aktif araç kullanabilen TI-OB70635 Workplace Physician Eymen Occupational Safety Services Ltd. Kocaeli, Istanbul (Asia) Experienced in Workplace Medicine, Full-time or part-time to be assigned to the companies to be assigned in Istanbul, Not having travel barriers, Having a good command of office programs, Able to report, Having the ability to report Workplace Physicians who can work in harmony in the companies they are assigned to... Workplace physician Ministry of Labor and Social Security Workplace Medicine certificate, Successful in human relations, strong verbal and written communication skills, Able to work full-time or part-time, Have a class B driver's license and drive actively
    RF-AQ67934 Boya Ustası turunçgil teknik makina imalat sanayi ve ticaret ltd. şti. Mersin

    BOYA USTASI

    DENEYİMLİ BOYA USTASI ARANMAKTADIR.

    SERVİS VE YEMEK İMKANIMIZ MEVCUTTU.

    MESAİ İMKANIMIZDA MEVCUTTUR.

    DENEYİMLİ ÇALIŞKAN
    RF-AQ67934 Paint Master Citrus Technical Machinery Manufacturing Industry and Trade Ltd. Sti. Myrtle

    PAINT MASTER

    EXPERIENCED PAINT MASTER IS WANTED.

    WE HAD SERVICE AND FOOD FACILITIES.

    IT IS AVAILABLE IN OUR WORKING HOURS.

    EXPERIENCED, HARDWORKING
    RF-MS31478 Teklif Mühendisi DAL TEKNİK MAKİNA A.Ş. İstanbul(Asya),İstanbul(Avr.)

     “We are proud and privileged to serve humanity with our knowledge and technology.”

    DAL TEKNİK MAKİNA (a DAL ENGINEERING GROUP Company) is a worldwide technology and engineering trailer in the cement industry, looking for a Proposal Engineer. We offer a very dynamic and innovative working environment.

    JOB DESCRIPTION

    * Prepare proposals for bids of the turnkey projects and main equipment in Cement Technology like Pyro Process, Raw Materials Handling, Grinding Equipment, Separators, etc.
    * Preferably a basic understanding of multi-disciplines Mechanical, Automation, Civil, etc.
    * Innovative developments during the design phase to decrease the CAPEX and OPEX,
    * Preparation of equipment list & sizing, pricing, and detailed technical offer for proposal letter
    * Reviewing requests for proposals
    * Outlining project specifications
    * Collaborating with other departments
    * Developing cost estimates for proj...
    RF-MS31478 Teklif Mühendisi DAL TEKNİK MAKİNA A.Ş. İstanbul(Asya),İstanbul(Avr.)

     “We are proud and privileged to serve humanity with our knowledge and technology.”

    DAL TEKNİK MAKİNA (a DAL ENGINEERING GROUP Company) is a worldwide technology and engineering trailer in the cement industry, looking for a Proposal Engineer. We offer a very dynamic and innovative working environment.

    JOB DESCRIPTION

    * Prepare proposals for bids of the turnkey projects and main equipment in Cement Technology like Pyro Process, Raw Materials Handling, Grinding Equipment, Separators, etc.
    * Preferably a basic understanding of multi-disciplines Mechanical, Automation, Civil, etc.
    * Innovative developments during the design phase to decrease the CAPEX and OPEX,
    * Preparation of equipment list & sizing, pricing, and detailed technical offer for proposal letter
    * Reviewing requests for proposals
    * Outlining project specifications
    * Collaborating with other departments
    * Developing cost estimates for proj...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
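
To make the loss parameters concrete, here is a simplified sketch of the idea behind MultipleNegativesRankingLoss: within a batch, each anchor should score its own positive higher than every other positive, and the scaled cosine similarities are fed into a cross-entropy objective. This is a plain-Python approximation using `scale=20.0` and cosine similarity as listed above. The function `mnr_loss` and the 2-dimensional vectors are illustrative assumptions, not the library's implementation.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all in-batch candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_positives = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped_positives = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the wrong positive

loss_aligned = mnr_loss(anchors, aligned_positives)
loss_swapped = mnr_loss(anchors, swapped_positives)
print(loss_aligned < loss_swapped)  # True
```

The other positives in the batch act as negatives, which is why the training data needs only anchor and positive columns.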
    

Evaluation Dataset

Unnamed Dataset

  • Size: 100 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 100 samples:
    |         | anchor | positive |
    |:--|:--|:--|
    | type    | string | string |
    | details | min: 82 tokens, mean: 375.65 tokens, max: 512 tokens | min: 50 tokens, mean: 244.45 tokens, max: 512 tokens |
  • Samples:
    anchor positive
    ManTor001 Üniversal Torna Ustası ANR MAKİNA SANAYİ İÇ VE DIŞ TİCARET LTD.ŞTİ İstanbul(Avr.) ÜniversalTorna konusunda Tecrübeli ve Sorumluluk sahibi,Teknik resim okumasını bilen toleranslara hakim,Tercihen minimum Meslek Lisesi Mezunu,Ölçü aletlerini kullanmasını bilen,Askerlik görevini tamamlamış,Min. Sektöründe 1 yıl deneyimli,20-50yaşları arasında bay,Avrupa yakasında Tercihen Beylikdüzü, Esenyurt , Hadımköy ,B.çekmece Avcılar da ikamet eden,Mesleğini seven, gelişime açık, Takım çalışmasına yatkın CNC Torna Operatörü arıyoruz. Üniversal unıversal torna freze usta formen teknik operator Savunma Sanayi ve Hidrolik Silindir sektöründe 2008 den günümüze faaliyet gösteren Esenyurt /Kıraç da bulunan fabrikamızda görevlendirmek üzere, aşağıdaki niteliklere sahipÜniversal Torna Ustaları aramaktayız. ManTor001 Universal Lathe Master ANR MAKİNA SANAYİ İç ve Dış TİCARET LTD.ŞTİ Istanbul(Avr.) Experienced and Responsible in Universal Lathe, Knowing how to read technical drawings, Knowing tolerances, Preferably minimum Vocational High School Graduate, Knowing how to use measuring instruments, Completed military service, Min. 1 year experienced in the sector, male between the ages of 20-50, Preferably on the European side, Beylikdüzü, Esenyurt, Hadımköy, B.çekmece We are looking for a CNC Lathe Operator who loves his profession, is open to development, prone to teamwork. We are looking for Universal Lathe Masters with the following qualifications to be assigned to our factory in Esenyurt / Kıraç, which has been operating in the Defense Industry and Hydraulic Cylinder sector since 2008.
    RF-VH5972 Satış ve Pazarlama Müdürü Tuğlar Turizm Otelcilik Gıda Tasıma San ve Tic Lt Kars

    Kars Sarıkamış Kayak Merkezinde yer alan 5 yıldızlı tam pansiyon konsepti olan ''Sarpino Mountain Hotel''  bünyesinde  ‘Satış ve Pazarlama Müdürü” olarak görev alacak takım arkadaşı aramaktayız. 

     Üniversitelerin ilgili bölümlerinden mezun, tercihen Turizm İşletme vb.

    ·         Otelde uygulanacak fiyatlar konusunda genel yönetim ile periyodik toplantılar yapmak, çevre otellerin fiyatlarını da inceleyerek uygulanacak fiyatları saptamak, yönetimin onayına sunmak ve onayı takiben yeni fiyatların duyurulması konusunda çalışma yapmak,

    ·         Yurt içinde ve Yurt dışında otelin imajını en iyi şekilde tanıtmak,

    ·         Online satış kanalları konusunda tecrübeli ve fiyat takibini yapabilecek,

    ·         Acenta ve kurumsal müşteri portföyü olan,

    ·         Alanında en az 3 yıl deneyim sahibi,

    ·         İyi derecede İngilizce bilen, Elektra ve Office programlarına hakim,

    ·         Güler yüzlü, d...
    RF-VH5972 Sales and Marketing Manager Tuğlar Tourism Hotel Management Food Transport San ve Tic Lt Kars

    We are looking for a teammate to work as "Sales and Marketing Manager" within the "Sarpino Mountain Hotel", which is a 5-star full-board concept located in Kars Sarıkamış Ski Center. 

    Graduated from relevant departments of universities, preferably Tourism Management, etc.

    ·         To hold periodic meetings with the general management about the prices to be applied in the hotel, to determine the prices to be applied by examining the prices of the surrounding hotels, to submit them to the approval of the management and to work on the announcement of new prices following the approval,

    ·         To promote the image of the hotel in the best way at home and abroad,

    ·         Experienced in online sales channels and able to follow prices,

    ·         Having an agency and corporate customer portfolio,

    ·         At least 3 years of experience in the field,

    ·         Good command of En...
    RF-GX25253 Çay Servis ve Temizlik Elemanı DD GRUP İNŞAAT SAN. TİC. LTD. ŞTİ İstanbul(Asya)

    İş Tanımı
    ---------

    DD Grup Yönetim Ofisi bünyesinde ,  İstanbul / Kartal lokasyonunda görevlendirilmek üzere Çay Servisi & Temizlik Personeli elemanı alınacaktır.

    Aranan Nitelikler
    -----------------

    - İçecek Servisi & Temizlik işi yapabilecek,
    - Hijyen kurallarına önem veren,
    - Temiz, titiz ve düzenli,
    - Pozitif, güleryüzlü ve dinamik,
    - Tercihen Pendik/Kartal/Maltepe/Bostancı  bölgesinde ikamet eden çalışma arkadaşlarına ihtiyacımız bulunmaktadır.

    Çay Servis ve Temizlik Elemanı Titiz Çay Servis ve Temizlik Elemanı
    RF-GX25253 Tea Service and Cleaning Staff DD GRUP İNŞAAT SAN. TIC. LTD. STİ Istanbul (Asia)

    Job Description
    ---------

    Within the DD Group Management Office, Tea Service & Cleaning Personnel will be recruited to be assigned to the Istanbul / Kartal location.

    Required Qualifications
    -----------------

    - Will be able to do Beverage Service & Cleaning work,
    - Attaching importance to hygiene rules,
    - Clean, meticulous and tidy,
    - Positive, friendly and dynamic,
    - We need colleagues who preferably reside in Pendik/Kartal/Maltepe/Bostancı region.

    Tea Service and Cleaning Staff Titiz Tea Service and Cleaning Staff
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
  • push_to_hub: True
  • hub_private_repo: True
  • batch_sampler: no_duplicates
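
The arithmetic implied by these settings can be sketched as follows. With 4,786 samples and a per-device batch size of 16 on what is assumed to be a single device, one epoch takes about 300 steps, and a warmup ratio of 0.1 gives about 30 warmup steps. The schedule function below is a simplified stand-in for a linear warmup and decay schedule, not the trainer's own code, and the `no_duplicates` sampler could in principle change the exact batch count.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

train_samples = 4786
batch_size = 16          # per_device_train_batch_size, single device assumed
num_epochs = 1
peak_lr = 2e-05
warmup_ratio = 0.1

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(warmup_ratio * total_steps)

print(steps_per_epoch, warmup_steps)  # 300 30
for step in (15, 30, 165, 300):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, peak_lr))
```

Under these assumptions the learning rate peaks at step 30 and reaches zero at step 300.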

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: True
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

| Epoch | Step | Training Loss | Validation Loss | en-tr-jobs-validation_cosine_accuracy | en-tr-jobs-test_cosine_accuracy |
|:--|:--|:--|:--|:--|:--|
| -1 | -1 | - | - | 0.9800 | - |
| 0.3333 | 100 | 0.2176 | 0.0000 | 1.0 | - |
| 0.6667 | 200 | 0.0013 | 0.0000 | 1.0 | - |
| 1.0 | 300 | 0.0011 | 0.0 | 1.0 | - |
| -1 | -1 | - | - | - | 1.0 |
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.6
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.6.0
  • Accelerate: 1.3.0
  • Datasets: 3.3.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}