Heaven rewards diligence; learning never ends

regex

Removal of large list of strings from a text

Suppose that txt='Daniel Johnson and Ana Hickman are friends. They know each other for a long time. Daniel Johnson is a professor and Ana Hickman is writer.' is a large piece of text and I want to remove a big list of strings such as removalLists=['Daniel Johnson','Ana Hickman'] from it. That is, I want to replace every element of the list with ' '. I know that I can do it easily using a for loop such as for string in removalLists: txt=re.sub(string,' ',txt). I wonder if I can do it faster.

2021-09-24 17:50:55    Category: Q&A    regex   python-3.x
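One common way to avoid calling re.sub once per string is to build a single alternation pattern and make one pass over the text. The following is a minimal sketch under that assumption, not a benchmarked answer; the name removal_list is a hypothetical stand-in for removalLists, and re.escape is added so that entries containing regex metacharacters are treated literally.

```python
import re

txt = (
    "Daniel Johnson and Ana Hickman are friends. They know each other for a long time. "
    "Daniel Johnson is a professor and Ana Hickman is writer."
)
removal_list = ["Daniel Johnson", "Ana Hickman"]  # hypothetical stand-in for removalLists

# Escape each entry so it is matched literally, and put longer entries first
# so that a longer string wins over a shorter string that is its prefix.
alternatives = sorted((re.escape(s) for s in removal_list), key=len, reverse=True)
pattern = re.compile("|".join(alternatives))

# A single pass over the text replaces every listed string with one space.
cleaned = pattern.sub(" ", txt)
print(cleaned)
```

Whether this is actually faster depends on the size of the list and the regex engine, so it is worth timing against the loop on real data.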

python re can't find this grouped name

I am trying to give advice on the format of paper references. For example, for an academic dissertation, the format is: author. dissertation name[D]. place where it is stored: organization that holds the copy, year in which the dissertation was published. Obviously, there may be some punctuation in every item except the year, for example: Smith. The paper name. The subtitle of paper[D]. United States: MIT, 2011. Often, the place where it is stored and the year are missing, for example: Smith. The paper name. The subtitle of paper[D]. US, 2011; Smith. The paper name. The subtitle of paper[D]. US: MIT. I want to program like this

2021-09-24 17:27:11    Category: Q&A    python   regex   named
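The asker's own code is cut off in the excerpt, so the following is only a sketch of how the described dissertation format could be captured with named groups. The group names (author, title, place, organization, year) and the helper parse_reference are hypothetical, and the pattern assumes the author contains no period and the year has four digits.

```python
import re

# Hypothetical pattern for: author. title[D]. place: organization, year
# The organization and the year are optional, as in the examples in the excerpt.
REFERENCE = re.compile(
    r"""
    (?P<author>[^.]+)\.\s*              # author, up to the first period
    (?P<title>.+?)\[D\]\.\s*            # title (may contain periods), then [D].
    (?P<place>[^:,]+?)                  # place, up to a colon, a comma, or the end
    (?::\s*(?P<organization>[^,]+?))?   # optional ": organization"
    (?:,\s*(?P<year>\d{4}))?            # optional ", year"
    \s*$
    """,
    re.VERBOSE,
)

def parse_reference(text):
    """Return a dict of the named groups, or None if the text does not fit."""
    match = REFERENCE.match(text.strip())
    return match.groupdict() if match else None

print(parse_reference("Smith. The paper name. The subtitle of paper[D]. United States: MIT, 2011"))
print(parse_reference("Smith. The paper name. The subtitle of paper[D]. US, 2011"))
print(parse_reference("Smith. The paper name. The subtitle of paper[D]. US: MIT"))
```

A group that did not take part in the match is reported as None in groupdict(), which makes it easy to tell the user which item is missing.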

Confusion related to regular expression [closed]

I have this confusion related to regular expression. If there are two sets A and B then is (AB)* = A*B*?

2021-09-24 16:04:14    Category: Q&A    regex   regular-language
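In general the two languages differ, and a small counterexample makes this concrete. The sketch below assumes, purely for illustration, that A = {"a"} and B = {"b"}; the helper in_language is a hypothetical name. The word "abab" belongs to (AB)* but not to A*B*, while "aab" belongs to A*B* but not to (AB)*.

```python
import re

# Assumption for illustration: A = {"a"} and B = {"b"}.
ab_star = re.compile(r"(?:ab)*")     # (AB)*: zero or more repetitions of "ab"
a_star_b_star = re.compile(r"a*b*")  # A*B*: any number of "a", then any number of "b"

def in_language(pattern, word):
    """True if the whole word is matched by the pattern."""
    return pattern.fullmatch(word) is not None

# Print membership of a few words in each language.
for word in ["", "ab", "abab", "aab"]:
    print(repr(word), in_language(ab_star, word), in_language(a_star_b_star, word))
```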

Splitting a column in a DataFrame based on multiple possible delimiters

I have an address column in a pandas DataFrame with 3 types of information, namely street, colony and city. There are three values with two possible delimiters, either a ',' or a white-space, e.g. it can be either Street1,Colony1,City1 or Street1 Colony1 City1. I need to split this column into three columns labelled 'Street', 'Colony' and 'City', with the values from this Address column split accordingly. What is the most efficient way to do this as the pandas split function only allows a single delimiter or a regular expression (maybe a regex expression for this as I'm not very

2021-09-24 08:42:50    Category: Q&A    python   regex   pandas
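A single regular expression can cover both delimiters. The sketch below uses only the standard re module so that it runs without pandas; the names DELIMITERS and split_address are hypothetical. It assumes that no individual value contains a space or a comma, as in the examples in the excerpt. The same pattern could presumably be passed to pandas' Series.str.split with expand=True to obtain three columns.

```python
import re

# A comma with optional surrounding whitespace, or a run of whitespace.
DELIMITERS = re.compile(r"\s*,\s*|\s+")
LABELS = ("Street", "Colony", "City")

def split_address(address):
    """Split an address into its three labelled parts."""
    parts = DELIMITERS.split(address.strip())
    if len(parts) != len(LABELS):
        raise ValueError(f"expected 3 parts, got {len(parts)}: {address!r}")
    return dict(zip(LABELS, parts))

print(split_address("Street1,Colony1,City1"))
print(split_address("Street1 Colony1 City1"))
```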

RegExp Headache for mixed text/digits

I'm doing 'mustache light' templating with the data coming out of a database with JavaScript. The data looks like this: As you can see in {{Figure 1-1}}, and again in {{Figures 1-2}} and {{1-3}}, the hozzfrazz is much cleaner than the hooble stick. My regexp is /\{\{[-a-zA-Z0-9\s]+\}\}/gi, which to my mind captures all of the above mustaches. But the only one being recognized in my function is {{1-3}}, not the other two. Any help? Apparently I need to add my function as well, since the regex works: var mReg = new RegExp("\{\{[-a-zA-Z0-9\s]+\}\}"); var link_tpl = "<a href='#' rel='@$' class=

2021-09-24 08:05:25    Category: Q&A    javascript   regex

How to using String.split in this case?

I want to write a function like this: - Input: "1" -> return : "1" - Input: "12" -> return : ["1","2"] If I use the function split(): String.valueOf("12").split("") -> ["","1","2"] But I only want to get the result: ["1","2"]. What is the best way to do this? In fact, I can do that: private List<String> decomposeQuantity(final int quantity) { LinkedList<String> list = new LinkedList<String>(); int parsedQuantity = quantity; while (parsedQuantity > 0) { list.push(String.valueOf(parsedQuantity % 10)); parsedQuantity = parsedQuantity / 10; } return list; } But I want to use split() for having an

2021-09-24 08:03:04    Category: Q&A    java   regex

Replace every occurrence of a word with each line from an list-file

I want to name each .tar file in the code below and name it based on a list-file that contains the names, but I don't want to mess with the exclusion tag. I have a list file and I was thinking of using a text editor and adding cvf to the beginning of each line I have in a list and then use sed to replace the string cvf, thus adding the flag and then the name follows. i.e. cvf name1 cvf name2 cvf name3 I tried using sed 's/cvf/word2/g' input.file and as expected it only replaces cvf with the replacement word. I want the replacement word (word2) to change and to be each line from a list file

2021-09-24 07:49:24    Category: Q&A    regex   linux   bash   sed

How to read VCard file?

I want to read a VCard file. I use this sample, but when I use this solution for this file: BEGIN:VCARD VERSION:4.0 N:Gump;Forrest;;; FN: Forrest Gump ORG:Bubba Gump Shrimp Co. TITLE:Shrimp Man PHOTO:http://www.example.com/dir_photos/my_photo.gif TEL;TYPE=work,voice;VALUE=uri:tel:+1-111-555-1212 TEL;TYPE=home,voice;VALUE=uri:tel:+1-404-555-1212 ADR;TYPE=work;LABEL="42 Plantation St.\nBaytown, LA 30314\nUnited States of America" :;;42 Plantation St.;Baytown;LA;30314;United States of America EMAIL:forrestgump@example.com REV:20080424T195243Z END:VCARD it doesn't find the parameters of Email/Phone/Address

2021-09-24 07:48:47    Category: Q&A    c#   .net   regex

How to convert a regex from PCRE to POSIX format, that warns about repetition-operator operand invalid?

Trying to anonymize received headers for relayed messages from authenticated postfix users, there is an example from https://we.riseup.net/debian/anonymizing-postfix: /^Received: from (.* \([-._[:alnum:]]+ \[[.[:digit:]]{7,15}\]\)).*?([[:space:]]+).*\(Authenticated sender: ([^)]+)\).*by (auk\.riseup\.net) \(([^)]+)\) with (E?SMTPS?A?) id ([A-F[:digit:]]+).*/ REPLACE Received: from [127.0.0.1] (localhost [127.0.0.1])$2(Authenticated sender: $3)${2}with $6 id $7 When editing the file regexp:/etc/postfix/header_checks the result is an error message: line 15: repetition-operator operand invalid

2021-09-24 07:35:50    Category: Q&A    regex   posix   pcre   postfix-mta

How to use RegEx in an if statement in Python?

I'm doing something like a "Syntax Analyzer" with Kivy, using re (regular expressions). I only want to check for valid syntax of basic operations (like +|-|*|/|(|)). The user types the string (with the keyboard) and I validate it with a regex. But I don't know how to use a regex in an if statement. What I want is: if the string that the user gives me isn't correct (or doesn't match the regex), print something like "Invalid string", and if it is correct, print "Valid string". I've tried with: if re.match(patron, string) is not None: print ("\nTrue") else: print("False") but it doesn't matter what the string has

2021-09-24 07:32:56    Category: Q&A    python   regex
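One point that may be relevant here: re.match only requires the pattern to match at the start of the string, so trailing invalid text goes unnoticed, whereas re.fullmatch requires the whole string to match. The sketch below illustrates the if statement with fullmatch. The pattern PATRON and the function check are hypothetical, since the asker's pattern is not shown; the pattern only accepts integers separated by +, -, * or / and deliberately ignores parentheses, because balanced parentheses cannot be verified by a plain regular expression of this kind.

```python
import re

# Hypothetical pattern: an integer, then zero or more "operator integer" pairs.
PATRON = re.compile(r"\s*\d+(?:\s*[-+*/]\s*\d+)*\s*")

def check(string):
    """Return a message saying whether the whole string is a valid expression."""
    if PATRON.fullmatch(string) is not None:
        return "Valid string"
    return "Invalid string"

print(check("1 + 2*3"))
print(check("1 + + 2"))
print(check("2*3abc"))
```

With re.match instead of fullmatch, the last input would be accepted, because the prefix "2*3" matches on its own.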