[AVITI] Need to anonymize fastq headers -> problem
The awk command randomly adds a ":" at the end of the quality score line, so the fastq files are corrupted. Example (the quality line below ends with an extra ":"):
/work/project/PlaGe/data/Element/20250221_AVITI3_Marie_SideB_MARIE_TempSo_T-sperm_1740389401/nextflow/LANE2/PROJECT_T-sperm/T-sperm-Emseq/work/46/f454650ae59ba166e9af8445caeacb$ head -n 792 T29_S91_L002_R1_001.fastq.gz
TCCTCTCCTTCACATTAAAAATAATAAAATTCTCAAAACAAATAACCTCAAAAAATAAATTTAAATTTATAAATTTAAACAAACAAAACTTAACTTTAACCTTTATAAAAAACTTCAATATTTACTCTTCACTATCCTCAAAAAACTCTT
+
@HIIHIIIAIKMIMGLNFINFLBNMNIMNFINNJNNMBHN9KNNGLLCMJGHGNHLMLNENNKFCMNIN@MNNGMNJNMBN@NMMNNNIB4MMMM>NNNNLGHNNNMMKAKFNBDMINMMG4HJJMINFMNMGL;KLNNNMMNH?MKMJC:
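Command line to check a fastq (it counts the records whose sequence and quality lines differ in length; 0 means the file is consistent):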
zcat T13_S85_L002_R2_001.fastq.gz | paste - - - - | awk -F"\t" '{ if (length($2) != length($4)) print $0 }' | wc -l
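The one-liner above only gives a count. To see which records are affected, the same length check can be sketched in Python. This is a minimal sketch, not part of the pipeline: the function name find_length_mismatches is hypothetical, and it assumes a standard 4-line-per-record fastq, gzipped or not.

```python
import gzip
from itertools import islice


def find_length_mismatches(path):
    """Yield (record_number, header, seq_len, qual_len) for every record
    whose sequence and quality lines differ in length.

    Hypothetical helper; assumes 4 lines per record (no wrapped sequences).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        record_number = 0
        while True:
            # Read one record: header, sequence, '+', quality.
            record = [line.rstrip("\r\n") for line in islice(handle, 4)]
            if not record:
                break
            record_number += 1
            if len(record) < 4:
                # Truncated final record: report it with unknown lengths.
                yield (record_number, record[0], None, None)
                break
            header, seq, _plus, qual = record
            if len(seq) != len(qual):
                yield (record_number, header, len(seq), len(qual))
```

For example, looping over find_length_mismatches("T29_S91_L002_R1_001.fastq.gz") and printing each tuple lists the record number and header of every corrupted read, which helps check whether the extra character always lands on the last line of a chunk.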
Edited by VERNETTE CAROLINE