virgintrain1
Member
- Joined
- 29 Jul 2011
- Messages
- 212
Such amazing news, now all we need is those brilliant people that created that spreadsheet for the ScotRail announcements!
I'll do it anyway for the announcement site
Thank you. Much appreciated.
It will take weeks and weeks to label them all, unless we have a huge number of people helping. With the ScotRail files there were only just over 1,000 files. With these there are almost 30,000 files. It is an enormous task.
Edit - and for anyone playing along at home, this is the requisite hideous command line:
find -iname "*.SEG" -exec bash -c "dd if=\""\{\}"\" of=/dev/stdout bs=1 skip=48 status=none | sox -t raw -b 8 -r 16000 -e a-law -X - output/\""\{\}"\".wav" \;
I also had luck with the slightly shorter:
find -iname "*.SEG" -exec bash -c "tail -c +48 '{}' | sox -t raw -b 8 -r 16000 -e a-law -X - '{}.wav'" \;
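For anyone who would rather not paste the one-liners, here is a minimal sketch of the same conversion as a small script. It assumes, based on the `skip=48` in the dd command, that each .SEG file is a 48-byte header followed by raw A-law audio; the function names and `HEADER_BYTES` are hypothetical, and the sox options are copied from the commands above. One thing worth knowing: `tail -c +N` starts output at byte N, so `+48` in the shorter command skips only 47 bytes and keeps one extra byte (one sample at 8 bits) compared with the dd version.

```shell
#!/bin/sh
# Assumption: 48-byte header, taken from `skip=48` in the dd command.
HEADER_BYTES=48

# Print a file's contents minus the header. `tail -c +N` starts output AT
# byte N (counting from 1), so skipping 48 bytes means starting at byte 49.
strip_header() {
    tail -c "+$((HEADER_BYTES + 1))" "$1"
}

# Convert one .SEG file to WAV, using the same sox options as the thread:
# raw input, 8-bit A-law samples at 16 kHz, with the bit order reversed (-X).
seg_to_wav() {
    strip_header "$1" | sox -t raw -b 8 -r 16000 -e a-law -X - "$2"
}

# Convert every .SEG file under src_dir into out_dir (hypothetical helper).
# File names containing newlines are not handled by this simple loop.
convert_all() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    find "$src_dir" -iname '*.SEG' | while IFS= read -r seg; do
        seg_to_wav "$seg" "$out_dir/$(basename "$seg").wav"
    done
}
```

With the functions loaded, something like `convert_all . output` should do the whole folder. Passing file names as arguments, rather than splicing `{}` into a `bash -c` string, avoids trouble with names that contain quotes.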
There's a few oddities in the dataset...
For example, a German announcement (well, a great many, from 9100 to 9112):
View attachment 142755
From my German-speaking friend:
I assume this would be used for an airport/international station, e.g. Gatwick.
07-12 are all one audio file, saying 'For safety reasons the train station must be closed. Please leave the station as quickly as possible at the next available exit. Please take your luggage with you.'
Also remember that Ketech provide the Eurostar station announcements too. So I reckon the German announcements are for London St Pancras, Ebbsfleet and Ashford in case German services ever start. There are a bunch of French announcements in there too. I think the Dutch announcements are probably somewhere in there too.
The "Only Woman" lady is the voice that was previously used at Brighton, East Croydon, Gatwick Airport, London Victoria, and all Virgin Trains managed stations from around 2012 until around 2018, when she was very unfortunately removed. She was called "Only Woman" because when she was first installed they accidentally set the system up wrongly, so that she said "only" at the end of every announcement even if the train was calling at multiple stations. Nobody knows who she is. We have been unable to find out her real name. Personally she was my favourite voiceover artist. I loved her announcements. She sounded great.
Also, for anyone who has not heard her before, here is an example of an "Only Woman" voiced announcement attached below.
but she didn't say 'only'!
Originally she ended all of her announcements with "Only", even if the calling pattern had several stops. For example, she would originally say "Calling at: Lockerbie, Motherwell and Glasgow Central only!"
The IDs are the same across all voices,
No. No they are not. Most of them match, but not all.
This is great work, it’s certainly evocative that these two voices can be preserved for posterity.
One quick question - is anyone able to have a go at the *.pcc file that TfL have provided for the Piccadilly Line Julie Berry 73 stock announcements?
I tried running the Male1 segments through voice recognition (Whisper base.en) in the hope that the result might save some time.
I'm not sure it has, but the results are hilarious. Highlights include:
167: The Red Funnel Fairy Service, too
412: Happily Bridge
413: our broath
484: Broccoli wins
495: button on trend
523: Clap 'em
560: Dumb freeze
583: Exodus and David's
595: Games bra
680: In the key thing
1034: Biflete and new whore
1039: Chit Chista
1054: M's worth?
1085: Have a Food West
and so on!
(attached is actually a CSV but the forum won't let me upload it as that).
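For anyone wanting to repeat the experiment, a rough sketch of how a batch run could look. It assumes the Whisper command-line tool is installed as `whisper` (for example from the openai-whisper package) and that it writes one .txt transcript per input file into the output directory; the function names and the CSV layout are made up for illustration and are not necessarily how the attached CSV was produced.

```shell
#!/bin/sh
# Transcribe every WAV in wav_dir with the Whisper CLI (assumed installed),
# using the base.en model mentioned above.
transcribe_dir() {
    wav_dir=$1
    txt_dir=$2
    mkdir -p "$txt_dir"
    for wav in "$wav_dir"/*.wav; do
        [ -e "$wav" ] || continue
        whisper "$wav" --model base.en --language en \
            --output_format txt --output_dir "$txt_dir"
    done
}

# Merge the per-file .txt transcripts into one CSV on stdout.
# The ID column is simply the transcript's file name without ".txt".
transcripts_to_csv() {
    txt_dir=$1
    printf 'id,transcript\n'
    for txt in "$txt_dir"/*.txt; do
        [ -e "$txt" ] || continue
        id=$(basename "$txt" .txt)
        # Join lines with spaces, trim the end, double any quotes for CSV.
        text=$(tr '\n' ' ' < "$txt" | sed -e 's/ *$//' -e 's/"/""/g')
        printf '%s,"%s"\n' "$id" "$text"
    done
}
```

Usage would be along the lines of `transcribe_dir output transcripts` followed by `transcripts_to_csv transcripts > male1.csv`. The result still needs checking by ear, as the highlights above show.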
Hahaha. Lol. There are some funny errors in that. I think most voice recognition software is only really designed to recognise ordinary words rather than place names. We have lots of place names with difficult pronunciations in the UK, so that makes it a bit difficult.
Yes the .pcc files for the Piccadilly Line 1973 Stock are the other ones that have been causing an issue for ages.
For anyone who does not have them here are the .pcc files:
LU Piccadilly Line Announcements 2019 (www.mediafire.com)
The zip folder just contains one file called copyfile.pcc but it is impossible to open the file.
Some software (such as freefileviewer, which I used) will show what look to be the individual sound files within the .pcc file. I am not really an expert at this, but I am guessing these are all of the sound files, shown as bytes. Here are a few examples of what it shows:
View attachment 142777
View attachment 142778
View attachment 142779
View attachment 142780
But actually accessing and playing the sound files is impossible. This seems to be a more complicated issue than the .seg files: with the .seg files we at least had access to each file and just could not play them, but with these we cannot even access the individual sound files.
Interestingly these .pcc files are also provided by Ketech, just like the .seg files, so it seems that Ketech like using unusual file formats.
At least we have the .seg files sorted now and have access to the Celia Drummond and Phil Sayer files, which is amazing, but hopefully somebody will know how to access these .pcc files too.
It gets most of the station names right. Could you try that with the Celia files maybe? Is it quite quick to do?
Most of the files have the same contents for all voices.
OK, here's a badly auto-transcribed Female1 too. Some highlights:
478: Bristol Temple Meats
505: You can't count it
555: Dor just to south
566: Feeling Broadway
736: Lost with you
750: Market Hi-Brah
6459: Glundal sins in Crosby
6909: One Bruh!
8009: "Thank you, Bridge"
8025: folks don't harbor
8068: Hot library
8157: Korean lari
8185: Physically
8802: "Well, I'm green"
9084: "Pen wouldn't date, right?"
9312: London he threw airport
12083: "So Cook, Bye Boss!"
12105: Hi-Brian is LinkedIn
Thank you for doing the "female1" folder too. That is much appreciated. Some more hilarious transcribed recordings! I guess the software does not recognise British accents that well!
Thanks for this.
I've been having a go, but the format is very strange. The best I have managed so far is attached. It sounds terrible, but I think that's a voice in there somewhere, which suggests it is some kind of PCM format, but possibly with more elaborate voice coding techniques (ADPCM/LPC/CELP) which I have yet to figure out.
Thank you. That is definitely Julie Berry that you can hear. It is extremely poor quality so you cannot make out what she is saying, but it is definitely her. That is the closest that we have got so far to accessing them, so this is certainly a good start.
Has anybody found the recording of Phil Sayer's security announcement, the one about “luggage and lost items may be removed and destroyed”?
I think these are the sound files that you are after:
How did you manage to play the .seg files without the program?
We've managed to convert them all to wav and MP3: https://github.com/Rail-Announcements/ketech-llpa-announcements
When arriving into Victoria, after Julie says “please remember to take all your personal belongings with you when you leave the train”, the announcement used to continue with “and please do not leave unattended items of luggage in the train or on the station”. This second part is no longer played. Does anyone know when (or why) it was removed?
I only recall hearing the latter part when arriving into Gatwick, personally.