Unsupported: '^' from interegular when string has pattern in JSON schema #931

Closed
ekagra-ranjan opened this issue May 30, 2024 · 3 comments · Fixed by #995

ekagra-ranjan commented May 30, 2024

Describe the issue as clearly as possible:

Hi! When I use a JSON schema in Outlines that has a regex pattern inside a string type, it tries to run interegular.parse_pattern(regex_string) and fails with Unsupported: '^'. This happens whenever the regex pattern inside the string contains ^ or $.

Example json schema

{
  "type": "object",
  "properties": {
    "address": {
      "type": "object",
      "properties": {
        "postalCode": {
          "type": "string",
          "pattern": "^\\d{5}$"
        }
      }
    }
  }
}

This seems like a limitation of interegular, but Outlines supports pattern within a string type, so I was wondering whether a regex pattern in a JSON schema is really supported or not.
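For context on why the anchors show up in schemas at all: JSON Schema treats `pattern` as an unanchored regex, so a validator accepts any string that contains a match unless the schema author adds ^ and $. The sketch below illustrates this with Python's standard `re` module only. The `validates` helper is a hypothetical stand-in for a schema validator, not an Outlines or interegular API.

```python
import json
import re

# In JSON text the backslash must be escaped, so \d is written as \\d.
schema = json.loads(r'{"type": "string", "pattern": "^\\d{5}$"}')
pattern = schema["pattern"]  # the Python string ^\d{5}$


def validates(pattern: str, value: str) -> bool:
    """Hypothetical stand-in for a validator: unanchored search semantics."""
    return re.search(pattern, value) is not None


anchored_ok = validates(pattern, "12345")        # exactly five digits
anchored_long = validates(pattern, "123456")     # rejected thanks to ^ and $
unanchored_long = validates(r"\d{5}", "123456")  # accepted without anchors

# Whole-string matching makes the anchors redundant.
full_ok = re.fullmatch(r"\d{5}", "12345") is not None
full_long = re.fullmatch(r"\d{5}", "123456") is not None
```

If generation always has to match the whole string, as `re.fullmatch` does here, then dropping the anchors does not change which strings are accepted for a pattern like this one.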

Also, it seems we wrap the pattern between ^ and $ in double quotes. This would convert ^\d{5}$ to ^"\d{5}"$. Could we let the user decide whether it's a string or a number?

cc: @lapp0

Steps/code to reproduce the bug:

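import interegular

# Raw string avoids Python's invalid "\d" escape warning
pattern = r"^\d{5}$"
# Mimics the quote-wrapping described above: keep the anchors, quote the body
new_pattern = rf'(^"{pattern[1:-1]}"$)'
print(new_pattern)  # (^"\d{5}"$)
# Fails here with Unsupported: '^'
regex_pattern = interegular.parse_pattern(new_pattern)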

Expected result:

no error

Error message:

Unsupported: '^'

Outlines/Python version information:

Version information

``` (command output here) ```

Context for the issue:

No response

@ekagra-ranjan
Author

Hi @brandonwillard @lapp0, can someone confirm this, please? Thanks!

Collaborator

lapp0 commented Jun 13, 2024

Hi @ekagra-ranjan

Interegular explicitly doesn't support ^ or $ and has tests asserting this: https://github.com/MegaIng/interegular/blob/758f83721c1ac306b96b7a706b69716a67ed954b/tests/test_patterns.py#L83-L84

I'm not sure why Outlines converts ^foo$ into ^"foo"$. This seems to be wrong; the anchors should probably be stripped.

Are you able to make your pattern work by removing the illegal ^ and $?
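One way to read the suggestion above is to strip the anchors before the pattern is wrapped in quotes. The sketch below is a hypothetical illustration, not the change merged in #995: `strip_anchors` and `to_json_string_regex` are made-up names, and it relies only on the standard library's `re` module to check the result.

```python
import re


def strip_anchors(pattern: str) -> str:
    """Remove one leading '^' and one trailing unescaped '$' (hypothetical helper)."""
    if pattern.startswith("^"):
        pattern = pattern[1:]
    if pattern.endswith("$"):
        body = pattern[:-1]
        # An odd number of backslashes before '$' means it is an escaped literal.
        trailing_backslashes = len(body) - len(body.rstrip("\\"))
        if trailing_backslashes % 2 == 0:
            pattern = body
    return pattern


def to_json_string_regex(pattern: str) -> str:
    """Wrap the anchor-free pattern in the quotes of a JSON string value."""
    return f'"{strip_anchors(pattern)}"'


wrapped = to_json_string_regex(r"^\d{5}$")
matches_quoted_zip = re.fullmatch(wrapped, '"12345"') is not None
rejects_short_zip = re.fullmatch(wrapped, '"1234"') is None
```

This only handles anchors at the very start and end of the pattern. Anchors inside alternations such as `^a|^b` would need a real regex parser, so treat it as a starting point rather than a complete fix.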

@ekagra-ranjan
Author

> Are you able to make your pattern work by removing the illegal ^ and $?

Yes, that works.

> I'm not sure why outlines converts ^foo$ into ^"foo"$, this seems to be wrong, they should probably be stripped.

This is what confuses me too. I guess this was not intended, and the anchors should have been stripped, as you said.
