Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try to fix global / modulaized import ns conflict #2057

Merged
merged 10 commits into from
Apr 15, 2020

Conversation

a1trl9
Copy link
Contributor

@a1trl9 a1trl9 commented Mar 26, 2020

Hi,

Tried to fix #1853 by prioritizing global imports.

Please let me know if I made any mistakes (:

@a1trl9 a1trl9 closed this Mar 26, 2020
@a1trl9 a1trl9 reopened this Mar 26, 2020
@alexcrichton
Copy link
Contributor

Thanks for the PR!

I'm not super familiar with the issue here (a test would be great!) but it feels like the fix here should be with generating imports as opposed to reordering definitions?

@a1trl9
Copy link
Contributor Author

a1trl9 commented Mar 27, 2020

Thanks for the PR!

I'm not super familiar with the issue here (a test would be great!) but it feels like the fix here should be with generating imports as opposed to reordering definitions?

@alexcrichton Thank you for your comment! I think you are right. Will recheck my code and add tests.

@a1trl9
Copy link
Contributor Author

a1trl9 commented Mar 30, 2020

Hi @alexcrichton,

Would like to share two approaches I have tried the last few days for this issue and please don't hesitate to point out my mistakes.

About Issue

The minimal case, as presented in #1853, is like following:

use wasm_bindgen::prelude::*;

#[wasm_bindgen(module = "fs")]
extern "C" {
	#[wasm_bindgen(js_namespace = window)]
	fn f1();
}

#[wasm_bindgen]
extern "C" {
	#[wasm_bindgen(js_namespace = window)]
	fn f2();
}

#[wasm_bindgen]
pub fn yoyo() {
	f1();
	f2();
}

For now, an error would be thrown: cannot import window from two locations. But actually, as one of window is imported from module fs, we can generate code like:

import { window as window2 } from 'fs';
import * as wasm from './wasm_test_bg.wasm';

function notDefined(what) { return () => { throw new Error(`${what} is not defined`); }; }
/**
*/
export function yoyo() {
    wasm.yoyo();
}

export const __wbg_f1_5e436d22ea2eab51 = typeof window2.f1 == 'function' ? window2.f1 : notDefined('window2.f1');

export const __wbg_f2_a182d819e1c1a65f = typeof window.f2 == 'function' ? window.f2 : notDefined('window.f2');

That is, rename imported functions from module automatically, and leave global imports untouched.

Approach 1

As you suggested, I tried to figure out if it is possible to generate imports which prioritize global imports, so that I don't need to resort it afterwards when generating adapters like I did last week. The main challenge for this approach is, it seems like all imports are processed linearly in a loop, and imports are stored separately in programs, which again are processed and generated linearly. The code structure is like:

// code from cli-support/src/wit/mod.rs

// get decoded programs    
let programs = extract_programs(module, &mut storage)?;

// try to process program one by one.
for program in programs {
      cx.program(program)?;
}

fn program(...) {
      // nested, process import in a loop
      for import in imports {
           self.import(import)?;
      }
}

fn import(&mut self, import: decode::Import<'_>) -> Result<(), Error> {
      match &import.kind {
            decode::ImportKind::Function(f) => self.import_function(&import, f),
            decode::ImportKind::Static(s) => self.import_static(&import, s),
            decode::ImportKind::Type(t) => self.import_type(&import, t),
            decode::ImportKind::Enum(_) => Ok(()),
     }
}

Adapter info and instructions are built by function import_adapter, which called by import_function, import_static and import_type (Hope I didn't miss something).

So, overall, it looks like unless I rewrite functions of how programs decoded and processed (maybe even backend logic such as codegen, encode?), or how imports are processed (e.g. extract all imports first and process them alongside other logic in program), I can't get an import list, or adapter list which prioritizing global imports at the beginning. I feel it might be a big change for this project. For example, for now, in the import_adapter function, the multi-step process makes an adapter with AdapterKind::Local {..} always follows the one with AdapterKind::Import {..}. Thus, some refactoring like above might alter the order of the whole adapter list, not only the import part.

Approach 2

For approach 2, I tried to leave all generation logic untouched, and not reorder adapters. Instead, build a new hashmap: global_defined_import_identifiers to store all global import keys ahead of import name generation. Now for function generate_identifier, two more parameters are required (one for global_defined_import_identifiers and another is simply a flag to show if the processed name is from a global import). Inside the function, two kinds of import name are handled separately. For non-global one, an index space would be left if any global imports with the same name exist. For global one, it now only considers conflicts with other global ones (but does this case exist now?) You can see my recent commits for more details.

This approach is not elegant. Any advice is welcomed if you think there are much better solutions, or this one actually misses some cases.

I also added one minimal test in cli crate, and tested some more cases manually, for example, with default, multiple global imports from the same namespace, etc., to double-check current behaviour is not broken:

default

use wasm_bindgen::prelude::*;

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_namespace = Math)]
    fn random() -> f64;
}

#[wasm_bindgen(module = "fs")]
extern "C" {
	#[wasm_bindgen(js_namespace = window)]
	fn f1();

	#[wasm_bindgen(js_namespace = default)]
	fn f2();
}

#[wasm_bindgen]
pub fn test() {
    f1();
    f2();
}
import { window, default as default1 } from 'fs';
import * as wasm from './wasm_test_bg.wasm';

function notDefined(what) { return () => { throw new Error(`${what} is not defined`); }; }
/**
*/
export function test() {
    wasm.test();
}

export const __wbg_f1_5e436d22ea2eab51 = typeof window.f1 == 'function' ? window.f1 : notDefined('window.f1');

export const __wbg_f2_a4247dfb755d7558 = typeof default1.f2 == 'function' ? default1.f2 : notDefined('default1.f2');

multiple global imports with same namespace

use wasm_bindgen::prelude::*;

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_namespace = Math)]
    fn random() -> f64;
}

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_namespace = Math)]
    fn log(a: f64) -> f64;
}

#[wasm_bindgen]
pub fn test() {
    random();
    log(3.2);
}
import * as wasm from './wasm_test_bg.wasm';

function notDefined(what) { return () => { throw new Error(`${what} is not defined`); }; }
/**
*/
export function test() {
    wasm.test();
}

export const __wbg_random_cba1f145a042ae34 = typeof Math.random == 'function' ? Math.random : notDefined('Math.random');

export const __wbg_log_7bde4e1bb51408eb = typeof Math.log == 'function' ? Math.log : notDefined('Math.log');

Same name between import namespace and export

use wasm_bindgen::prelude::*;

#[wasm_bindgen(module = "fs")]
extern "C" {
	#[wasm_bindgen(js_namespace = window)]
	fn f1();
}

#[wasm_bindgen]
pub fn window() {
    f1();
}
import { window } from 'fs';
import * as wasm from './wasm_test_bg.wasm';

function notDefined(what) { return () => { throw new Error(`${what} is not defined`); }; }
/**
*/
function window2() {
    wasm.window();
}
export { window2 as window };

export const __wbg_f1_5e436d22ea2eab51 = typeof window.f1 == 'function' ? window.f1 : notDefined('window.f1');

I am looking forward to any advice.

@alexcrichton
Copy link
Contributor

Thanks for the update! I think that a solution along the lines of this is the way to go, either by reordering how imports are bound or precalculating information about imports rather than generating them on the fly.

Could this be made a method instead of a standalone function to avoid passing maps around? Additionally could you explain the counts on globals a bit more? There's some logic around numbering that I don't fully understand myself.

@a1trl9
Copy link
Contributor Author

a1trl9 commented Apr 11, 2020

@alexcrichton Thanks for your comment.

UPDATE:

  • Now generate_identifier has been moved into Context so we don't need to pass either defined_identifiers or global_defined_import_identifiers. One issue here is since now generate_identifier is a private method, I didn't figure out how to test it directly like before. I added more related tests into import.rs / import.js to check if js_namespace and js_name conflicts could be handled correctly. Hope it could mitigate the drawback.
  • Some comments added for numbering.

@@ -3118,6 +3163,37 @@ impl<'a> Context<'a> {
fn adapter_name(&self, id: AdapterId) -> String {
format!("__wbg_adapter_{}", id.0)
}

fn generate_identifier(&mut self, name: &str, global_import: bool) -> String {
let cnt = (&mut self.defined_identifiers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this leading &mut may not be necessary?

if let Some(js) = js {
match &js.name {
JsImportName::Global { name } => {
self.global_defined_import_identifiers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this extra map is necessary? Could this just call generate_identifier for name here?

Once we generate an identifier for each global name I think every other name naturally gets a numerical suffix?

Copy link
Contributor Author

@a1trl9 a1trl9 Apr 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comment.

I feel it might not work in some cases. Since function import_name is called on the fly during iteration, some non-global import might be processed first and therefore produces and stores a name with a numerical suffix if name conflict with another global import occurs. Then afterwards when processing the global one it would be hard to figure out if the cached name in the hashmap comes from a global or non-global one (for the first case, an error would be thrown). Therefore separating hashmaps might be an easy option.

Handling global name conflicts right here might not work perfectly as well. Since at this stage, we still don't know if the import is actually field by dot-access rather than name itself. I guess this is why the following if-condition is tested at the beginning of import_name:

        if let Some(name) = self.imported_names.get(&import.name) {
            let mut name = name.clone();
            for field in import.fields.iter() {
                name.push_str(".");
                name.push_str(field);
            }
            return Ok(name.clone());
        }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er sorry but what I mean is that you're already iterating over the global names explicitly first here, so instead of adding a second map to track this couldn't this pre-populate the map with only global names during this first pass?

Copy link
Contributor Author

@a1trl9 a1trl9 Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPDATE
Well I think it's me not get things right... Yes it would be much better to just use generate_identifier here.

Also, since now all global imports are generated first, I feel it isn't necessary to bailout for global import name conflicts in function import_name anymore -- non-global imports with same name would be added numeric suffix automatically, while for same-name global imports, the hashmap imported_names and following if-condition at the beginning ofimport_name could handle them well:

        if let Some(name) = self.imported_names.get(&import.name) {
            let mut name = name.clone();
            for field in import.fields.iter() {
                name.push_str(".");
                name.push_str(field);
            }
            return Ok(name.clone());
        }

code updated.


Emm... my bad not make it clear.
Assuming we have imports like [Module { module: 'xxx_module', name: "Math"}, xxx, xxx, xxx, Global {name: "Math"}].
If here we count the global name Math in defined_identifiers instead of a new one. Then during calling the function import_name one by one, the Math from the module xxx_module would be processed first and increment *cnt of Math again in the same map:

            JsImportName::Module { module, name } => {
                // here, generate_identifier would increment *cnt for name `Math`
                let unique_name = self.generate_identifier(name, false);
                self.add_module_import(module.clone(), name, &unique_name);
                unique_name
            }

Afterwards, when processing the global import, the following code would break:

            JsImportName::Global { name } => {
                // current code, for clarify.
                // this wouldn't work surely, we have generated global names before.
                let unique_name = self.generate_identifier(name, true);
                if unique_name != *name {
                    bail!("cannot import `{}` from two locations", name);
                }
                // but modifying to the following code still not work
                // since non-global import `Math`
                // has been inserted to the same map, *cnt = 2. However the name conlifcts
                // are not from two global imports, bail is not expected.
                let cnt = self.defined_identifiers.entry(name.to_string()).or_insert(0)
                if *cnt > 1 {
                   bail!("cannot import `{}` from two locations", name);
                }
                unique_name
            }

} else {
Some(actual.to_string())
};
(&mut self.js_imports)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the leading &mut can be removed here

if let Some(js) = js {
match &js.name {
JsImportName::Global { name } => {
self.generate_identifier(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will still want to forward to import_name which does caching and error handling, it should just only call that method with JsImportName::Global values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sry but I don't quite understand what you suggest by "it should just only call that method with JsImportName::Global values." Could you please explain a little bit more?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh so my point is that I don't think we want to call generate_identifier as a bare function since it always generates a new identifier. This means that if the same global is referenced multiple times it'll throw off generation. Instead this should use import_name which has error handling and caching which should be used instead of the raw generate_identifier

@alexcrichton alexcrichton merged commit ad85de5 into rustwasm:master Apr 15, 2020
@alexcrichton
Copy link
Contributor

Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't imported aliased names in some circumstances
2 participants