RuntimeError: Invalid uid - 9174794 at position=0

Hi,

I have written the following script:

# Read the CSV containing the list of PMIDs per author:
with open("D:/Nancy/Pèse-Savants/Excercice Covid-19/Exercice 3/pmid_par_auteur.csv",'r', encoding='utf-8') as f:   
    # Split the list of authors and PMIDs into 2 separate columns:
    with open ("pmid_par_auteur_uniformise.csv", "w", encoding='utf-8') as fu:
        csv_f = csv.reader(f, delimiter = ';')
        for ligne in csv_f: 
            fu.write(ligne[0] + '\n')

auteur_pmid_doi = []

# Clean the data encoded in 'utf-8'
with open("pmid_par_auteur_uniformise.csv",encoding='utf-8') as fu:
    csv_fu = csv.reader(fu)
    
    for ligne in csv_fu:
        ligne[1] = ligne[1].replace("'", " ")
        ligne[1] = ligne[1].replace("[", " ")
        ligne[1] = ligne[1].replace("]", " ")
        ligne[1] = ligne[1].split(" , ")
        
        pmid_doi = []

        for pmid in ligne[1]:
            time.sleep(0.01)
            try : 
                handle = Entrez.esummary(db="pubmed", id=pmid) 
                record = Entrez.read(handle) 
                record = record[0]['DOI']
            except IndexError :
                print ('Missing DOI')
            except KeyError :
                print ('Missing DOI')
            else :
                pmid_doi.append([pmid, record])
            
        auteur_pmid_doi.append([ligne[0], pmid_doi])

auteur_pmid_doi

When I test my script on a small sample of data, it works well, but when I run it on the full dataset, I receive the following RuntimeError:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-38242e0b715e> in <module>
     25             try :
     26                 handle = Entrez.esummary(db="pubmed", id=pmid)
---> 27                 record = Entrez.read(handle)
     28                 record = record[0]['DOI']
     29             except IndexError :

~\anaconda3\lib\site-packages\Bio\Entrez\__init__.py in read(handle, validate, escape)
    485 
    486     handler = DataHandler(validate, escape)
--> 487     record = handler.read(handle)
    488     return record
    489 

~\anaconda3\lib\site-packages\Bio\Entrez\Parser.py in read(self, handle)
    343                 handle = BytesIO(_as_bytes(handle.read()))
    344         try:
--> 345             self.parser.ParseFile(handle)
    346         except expat.ExpatError as e:
    347             if self.parser.StartElementHandler:

c:\ci\python_1578510570019\work\modules\pyexpat.c in EndElement()

~\anaconda3\lib\site-packages\Bio\Entrez\Parser.py in endErrorElementHandler(self, name)
    694             # error found:
    695             value = "".join(self.data)
--> 696             raise RuntimeError(value)
    697         # no error found:
    698         if self.element is not None:

RuntimeError: Invalid uid 
9174794 at position=0

I changed the API key and tried the script on another computer, but the same error still occurs.

Could someone help me, please?

Thanks in advance.

Taking a quick look at the Bio.Entrez page, I see one likely problem in your script.

At no point do you invoke handle.close(), as shown on that page. A logical guess would be that handles are a finite resource; with a small dataset you didn’t run out of handles, but with a large dataset you exhausted the handle supply.

You are more likely to get useful assistance with Biopython via their mailing lists.
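
As a rough illustration of the handle.close() suggestion, here is a minimal sketch of the lookup step. It makes some assumptions: `fetch_doi` is a hypothetical helper name, and `entrez` is expected to be the `Bio.Entrez` module, passed in as an argument so the function can be tried without network access. It also catches the RuntimeError seen in the traceback, which the original try/except does not handle. Whether unclosed handles are the actual cause remains a guess.

```python
from contextlib import closing


def fetch_doi(entrez, pmid):
    """Return the DOI for one PMID, or None if it cannot be retrieved.

    `entrez` is assumed to be the Bio.Entrez module (hypothetical helper,
    not part of Biopython itself).
    """
    try:
        # closing() calls handle.close() even if parsing raises an exception.
        with closing(entrez.esummary(db="pubmed", id=pmid.strip())) as handle:
            record = entrez.read(handle)
        return record[0]["DOI"]
    except (IndexError, KeyError):
        # The summary came back but has no DOI entry.
        return None
    except RuntimeError as err:
        # The traceback above shows the parser raising RuntimeError for
        # error elements such as "Invalid uid"; skip that PMID and go on.
        print(f"Skipping {pmid!r}: {err}")
        return None
```

With Biopython installed, the loop body could then call `fetch_doi(Entrez, pmid)` after `from Bio import Entrez`, and treat a `None` result the same way as the 'Missing DOI' branches.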
